SkinnyPanda43 issue verified, this seems to be related to Python 3.9 and subprocesses.
Let me check what we can do
LOL totally 🙂
you can also increase the limit here:
https://github.com/allegroai/clearml/blob/2e95881c76119964944eaa0289549617e8afeee9/docs/clearml.conf#L32
@<1562610699555835904:profile|VirtuousHedgehong97>
source_url="s3:...",
This means your data is already on an S3 bucket; it will not "upload" it, it will just register it.
If you want to upload files, they should be local, and then when you call upload you can specify the target S3 bucket; the data will be stored in a unique folder in the bucket
Does that make sense?
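A minimal sketch of both flows (bucket names and paths here are illustrative):
```python
from clearml import Dataset

ds = Dataset.create(dataset_name="my_dataset", dataset_project="data")

# Data already on S3: only the links are registered, nothing is uploaded
ds.add_external_files(source_url="s3://my-bucket/already/uploaded/")

# Local files: add them, then upload to a target S3 bucket
ds.add_files(path="/local/data/folder")
ds.upload(output_url="s3://my-bucket/datasets")  # stored under a unique folder

ds.finalize()
```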
DilapidatedDucks58 I'm assuming clearml-server 1.7?
I think both are fixed in 1.8 (due to be released either next week, or the one after)
Check the examples on the github page, I think this is what you are looking for 🙂
https://github.com/allegroai/trains-agent#running-the-trains-agent
Yes it should
here is fastai example, just in case 🙂
https://github.com/allegroai/clearml/blob/master/examples/frameworks/fastai/fastai_with_tensorboard_example.py
Also, how do pipelines compare here?
Pipelines are a type of Task, so like Tasks you can clone and enqueue them, or set them as the target of the trigger.
the most flexible solution would be to have some way of triggering the execution of a script in the parent task environment,
This is the exact idea of the TriggerScheduler (see the sketch below)
What am I missing here?
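For reference, a minimal TriggerScheduler sketch (project/queue names are illustrative, and argument names may differ slightly between versions):
```python
from clearml.automation import TriggerScheduler

trigger = TriggerScheduler(pooling_frequency_minutes=3.0)

# Clone + enqueue a task (or a pipeline, which is also a Task)
# whenever a new dataset version appears in the "data" project
trigger.add_dataset_trigger(
    schedule_task_id="<task_or_pipeline_id>",
    schedule_queue="default",
    trigger_project="data",
    name="retrain-on-new-data",
)

trigger.start_remotely(queue="services")
```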
Hi @<1658281093108862976:profile|EncouragingPenguin15>
Should work, I'm assuming multiple nodes are running agents? Or are you saying Ray spins the jobs and clearml logs them?
Hi IrritableJellyfish76
If you are running code that uses clearml from Kubeflow, you have out-of-the-box integration between the two, what am I missing?
Local changes are applied before installing requirements, right?
correct
Yes
Are you trying to upload_artifact to a Task that is already completed?
TrickyRaccoon92 Thanks you so much! 😊
The only weird thing to me is not getting any "connection warnings" if this is indeed a network issue ...
Hi ScaryLeopard77
I think the error message you are getting is actually "passed" from Triton. Basically someone needs to tell it what the model input/output look like (matrix size/type); this is essentially the content of the "config.pbtxt", and it has to be set when spinning up the model endpoint. Does that make sense to you?
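For reference, a minimal config.pbtxt sketch (model name, platform, tensor names, and shapes here are made up for illustration):
```
name: "my_model"
platform: "tensorflow_savedmodel"
input [
  {
    name: "dense_input"
    data_type: TYPE_FP32
    dims: [ 784 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 10 ]
  }
]
```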
I just think that the create function should expect
dataset_name
to be None in the case of
use_current_task=True
(or allow the dataset name to differ from the task name)
I think you are correct, at least we should output a warning that it is ignored ... I'll make sure we do 🙂
Why does ClearML hide the dataset task from the main WebUI?
Basically you have the details from the Dataset page, why should it be mixed with the others?
If I specified a project for the dataset, I specifically want it there, in that project, not hidden away in some
.datasets
hidden sub-project.
This may be a request for a "Dataset" tab under the project; the main question is why you would need the Dataset Task itself?
Not all dataset objects are equal, and perhap...
RoughTiger69 the easiest thing would be to use the override option of Hydra:
```python
parameter_override={'Args/overrides': '[the_hydra_key={}]'.format(a_new_value)}
```
wdyt?
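For context, a sketch of where such an override could be passed, e.g. as a pipeline step (project/task names are illustrative):
```python
from clearml import PipelineController

pipe = PipelineController(name="hydra-pipeline", project="examples", version="1.0")
pipe.add_step(
    name="train",
    base_task_project="examples",
    base_task_name="hydra train task",
    # override the Hydra overrides list on the cloned task
    parameter_override={"Args/overrides": "[the_hydra_key={}]".format("a_new_value")},
)
pipe.start()
```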
Correct,
Notice that the glue has its own defaults and the ability to override containers from the UI
It said the command --aux-config got invalid input
This seems like an interface bug.. let me see if we can fix that 🙂
BTW: this seems like a triton LSTM configuration issue, we might want to move the discussion to the Triton server issue, wdyt?
Definitely!
Could you start an issue https://github.com/triton-inference-server/server/issues , and I'll join the conversation?
Is there any reference about integrating Kafka data streaming directly to clearml-serving...
Based on the log you have shared:
```
OSError: [Errno 28] No space left on device
```
I would increase the storage?
https://github.community/t/github-actions-failing-with-errno-28-no-space-left-on-device/18164/10
https://stackoverflow.com/questions/70175977/multiprocessing-no-space-left-on-device
https://groups.google.com/g/ansible-project/c/4U6MyvyvthQ
I would start by increasing the size of the TMPDIR folder
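For example, something like TMPDIR=/mnt/large_tmp python train.py (the path is illustrative) points Python's temp files at a larger mount.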
FrustratingWalrus87 If you need an interactive one, I think there is currently no alternative to TB's tSNE 🙂 it is truly great 🙂
That said you can use plotly for the graph:
https://plotly.com/python/t-sne-and-umap-projections/#project-data-into-3d-with-tsne-and-pxscatter3d
and report it to ClearML with Logger report_plotly :
https://github.com/allegroai/clearml/blob/e9f8fc949db7f82b6a6f1c1ca64f94347196f4c0/examples/reporting/plotly_reporting.py#L20
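A minimal sketch combining the two (dataset and names are illustrative):
```python
import plotly.express as px
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE
from clearml import Task

task = Task.init(project_name="examples", task_name="tsne-plotly")

# Project the data into 3D with t-SNE and build an interactive plotly figure
iris = load_iris()
proj = TSNE(n_components=3, random_state=0).fit_transform(iris.data)
fig = px.scatter_3d(x=proj[:, 0], y=proj[:, 1], z=proj[:, 2], color=iris.target)

# Send the interactive figure to the ClearML UI
task.get_logger().report_plotly(
    title="t-SNE", series="projection", iteration=0, figure=fig
)
```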
By your description it seems to make no difference whether I added the files via sync or add, since I will have to create a new dataset either way.
Sync is designed to take local folder(s) and add/remove files from a dataset based on the local changes (it does that automatically based on file existence/content)
The changes (i.e. added files) are uploaded as delta changes relative to the parent version, this means we are not always uploading all files.
Add on the other hand means you...
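A minimal sketch of the two flows (names and paths are illustrative):
```python
from clearml import Dataset

ds = Dataset.create(
    dataset_name="my_dataset", dataset_project="data",
    parent_datasets=["<parent_dataset_id>"],
)

# sync: diff a local folder against the dataset, adding/removing files to match it
ds.sync_folder(local_path="/data/my_folder")

# add: explicitly add files or folders
ds.add_files(path="/data/new_files")

ds.upload()    # only the delta vs. the parent version is uploaded
ds.finalize()
```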
well.. having the demo server by default lowers the effort threshold for trying ClearML and getting convinced it can deliver what it promises, and maybe test some simple custom use cases. I
This was exactly what we thought when we set it up in the first place 🙂
(I can't imagine the cost is an issue, probably maintenance/upgrades ...)
There is still support for the demo server, you just need to set the env key:
```
CLEARML_NO_DEFAULT_SERVER=0 python ...
```
PompousBeetle71 Check the beginning of the log, it should print the configuration, including the access key (excluding the secret) see if it makes sense...
ElegantCoyote26 It means we need to have a keras logger that logs everything to trains, then we need to hook it automatically.
Do you feel like PR-ing the logger (the hooking I can take care of 🙂 )?
I'll make sure they get back to you
LudicrousParrot69,
Are you trying to parse the attached table post-execution, then put it into a CSV on the HPO Task?
DefeatedMoth52 how many agents do you have running on the same GPU?