
AdventurousRabbit79 are you passing cache_executed_step=False to the PipelineController?
https://github.com/allegroai/clearml/blob/332ceab3eadef4997e897d171957975a247a6dc1/clearml/automation/controller.py#L129
Could you send a usage example ?
my pipeline controller always updates to the latest git commit id
This will only happen if the Task the pipeline creates has no specific commit ID, and instead just uses the latest from the git repo. Is this the case ?
Thanks MagnificentSeaurchin79 !
Let me check what's the status with this one, could it be the same as this one?
https://github.com/allegroai/clearml/issues/322
No (this is deprecated and was removed because it was confusing)
https://github.com/allegroai/clearml-agent/blob/cec6420c8f40d92ab1cd6cbe5ca8f24cf351abd8/docs/clearml.conf#L101
How did you define the decorator of "train_image_classifier_component" ?
Did you define: @PipelineDecorator.component(return_values=['run_model_path', 'run_tb_path'], ...
Notice two return values
I understand that it uses time in seconds when there is no report being logged..but, it has already logged three times..
Hmm could it be the reporting started 3 min after the Task started ?
I'm thinking of a few plots in my current in-house tooling which are slightly different than the standard charts we look at. For example a custom parallel coordinate chart that can use aggregations, categorical variables, etc.
This can be done by comparing experiments, then checking the Hyper-Parameters tab and selecting graph from the drop-down at the top
So my question in general is pertaining to if I would need to get better at Javascript if I were to make those changes. My guess is ...
Hi ObedientTurkey46
Use --services-mode in the agent, it will run many Tasks on the same machine. This is usually associated with the services queue, but it can be run on any queue. This way you could have the same machine easily running those multiple "control" tasks.
wdyt?
MinuteGiraffe30 if you run the following command while your current directory is where your code is, what are you getting?
$ git ls-remote --get-url origin
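If it is easier to script the same check, here is a minimal sketch, assuming git is on the PATH. The helper names origin_url_command and get_origin_url are made up for illustration and are not part of clearml:

```python
import subprocess
from typing import List, Optional


def origin_url_command(remote: str = "origin") -> List[str]:
    """Build the same command as above: git ls-remote --get-url <remote>."""
    return ["git", "ls-remote", "--get-url", remote]


def get_origin_url(repo_dir: str = ".", remote: str = "origin") -> Optional[str]:
    """Run the command inside repo_dir and return what git prints.

    Returns None if git could not be started, the directory does not
    exist, git exits with an error, or nothing was printed.
    """
    try:
        result = subprocess.run(
            origin_url_command(remote),
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    # Run this from the directory where your code is.
    print(get_origin_url("."))
```

Whatever this prints is what git resolves as the remote URL in that folder, so it can be compared with the repository shown on the Task.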
This is odd, can you send the full Task log? (remove any pass/user/repo that you think is sensitive)
I think my main point is, k8s glue on AKS or GKE basically takes care of spinning new nodes, as the k8s service does that. The AWS autoscaler is kind of a replacement, makes sense?
JitteryCoyote63 the agent.cuda_version (or CUDA_VERSION env) tells the agent which pytorch wheel to download. The CUDNN library can be included inside any wheel and it will work as long as the cuda / cudart exist on the system, for example pytorch wheels include the cudnn they use. agent.cudnn_version should actually be deprecated, and is not actually used.
For future reference, dependency order:
1. Nvidia Drivers
2. CUDA library and CUDA-runtime libraries (libcuda.so / libcudart.so)
3. CUDN...
Can you send the full log? This is odd, it will by default use the python executable it (the agent) is running with.
Regardless you can specify the python executable to be used here:
https://github.com/allegroai/clearml-agent/blob/bd411a19843fbb1e063b131e830a4515233bdf04/docs/clearml.conf#L44
Seems like passing the Task object is not working as expected (I'll make sure it is fixed).
Try: dataset._task.set_parent(Task.current_task().id)
Yey!
My pleasure 🙂
it seems it's following the path of the script i'm using to task.create, eg:
The folder it should run in is the script path you are passing (i.e. "script=ep_fn,")
Wrong path would imply that it is not finding the correct repository, is that the case?
Hmm, notice that it does store symlinks to parent data versions (to save on multiple copies of the same file). If you call get_mutable_local_copy() you will get a standalone copy
Hi SlipperyDove40
plotly is about 4MB... trains about 0.5MB. What's the breakdown of the packages? This seems far away from the 250MB limit
Hi Guys,
I hear you guys, and I know this is planned but it was probably bumped down in priority.
I know the main issue is the "Execution Tab" comparison, the rest is not an issue.
Maybe a quick hack to only compare the first 10 in the Execution, and remove the limit on the others? (The main issue with the execution is the git-diff / installed packages comparison that is quite taxing on the FE)
Thoughts ?
Hi UnsightlyShark53 I think you are absolutely right, there is no reason for the trains.errors.UsageError: ArgumentParser.parse_args() ... error.
As you mentioned, if auto_connect_arg_parser is False, it should just ignore what it picked automatically.
I will make sure the error is resolved. I will also make sure you will still be able to connect the argparse manually with task.connect(parser) after the Task has been created. Thanks for the reference! I took a look o...
Hi RoughTiger69
but still get the semantics of knowing when an (external) file changed?
How would you know it changed?
This implies you have a way to verify the hash, which means you download the data, no?
The bug was fixed 🙂
Task.running_locally()
Should do the trick
That works AND the feature works!
YEY
Quick follow up question, is there any way to abort a pipeline and all of the tasks it ran?
Hmm yes, currently if you abort the pipeline it has no "time" to abort the running Tasks (the DAG itself will stop, because the pipeline controller was aborted, but the running Tasks will continue).
In order to have better support, we need to add a previously requested feature for "abort" callback. This is actually not as straight forward as it sound...
somehow set docker_args and docker_bash_setup_script equivalent??
task.set_base_docker(...)
# somehow setup repo and branch to download to remote instance before running
This is automatically detected based on your local commit/branch as well as uncommitted changes
Hi RobustRat47
What do you mean by "log space for hyperparameter", what would be the difference? (Notice that on the graph itself you can switch to log scale when viewing in the UI)
Or are you referring to the hyper parameter optimization, allowing you to add log space ?
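To make the second option concrete, here is a rough sketch of what sampling a hyperparameter in log space usually means, compared with sampling it linearly. It is plain Python, not the ClearML optimizer API; the function names and the learning-rate range are assumptions for illustration only:

```python
import math
import random


def sample_linear(low: float, high: float, rng: random.Random) -> float:
    """Uniform over the raw values: most draws fall in the top decade."""
    return rng.uniform(low, high)


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Uniform over log10(value): every decade is equally likely."""
    if low <= 0 or high <= low:
        raise ValueError("need 0 < low < high")
    exponent = rng.uniform(math.log10(low), math.log10(high))
    return 10.0 ** exponent


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical learning-rate range spanning four decades.
    low, high = 1e-5, 1e-1
    print("linear:", [round(sample_linear(low, high, rng), 5) for _ in range(5)])
    print("log   :", [f"{sample_log_uniform(low, high, rng):.1e}" for _ in range(5)])
```

The log scale switch in the UI only changes how the graph is drawn, while this kind of sampling changes which values get tried in the first place.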