
Hi DeliciousKoala34
This means the PyCharm plugin was not able to run git on your local machine.
What's your OS?
Could it be that if you open cmd / shell, "git" is not in the PATH?
JitteryCoyote63 This is odd, you have both python3.9 and python3.8 on the container, and since it says (probably) on the Task that the agent should run python3.9, it is trying to use it for creating the environment (it does not matter that the agent itself is using python3.8).
1673431344706 agent-1 DEBUG /usr/bin/python3.9 /usr/bin/python3.9: No module named pip
This is the main issue: pip is missing for python3.9, and this is why it reverted to python3.8 when setting up the environment.
It should prob...
It said the command --aux-config got invalid input
This seems like an interface bug... let me see if we can fix that
BTW: this seems like a Triton LSTM configuration issue, we might want to move the discussion to a Triton server issue, wdyt?
Definitely!
Could you start an issue at https://github.com/triton-inference-server/server/issues , and I'll join the conversation?
Is there any reference about integrating Kafka data streaming directly with clearml-serving...
How did you install trains?
pip install git+
Why do you ask? Is your server sluggish?
Hi FancyWhale93, in your clearml.conf you can configure the default output URI; you can specify the file server as the default, or any object storage:
https://github.com/allegroai/clearml-agent/blob/9054ea37c2ef9152f8eca18ee4173893784c5f95/docs/clearml.conf#L409
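For reference, a minimal sketch of what that entry could look like in clearml.conf (the bucket/path is just a placeholder, any supported storage URI works the same way):
```
sdk {
    development {
        # every new Task uploads models/artifacts here unless output_uri is set explicitly
        default_output_uri: "s3://my-bucket/clearml-artifacts"
    }
}
```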
Hi NastyFox63
It seems like most of the reported plots were converted to PNGs (which is what the automagic does when it fails to convert a matplotlib figure into an interactive plot).
no more than 114 plots are shown in the plots tab.
Are you saying there is a 114-plot limit?
Is this true for "full screen" mode as well (i.e. not in the experiments table, but after switching to the full detailed view)?
I'm suggesting to make it public.
Actually I'm thinking of enabling users to register drivers at runtime, expanding the capability to support any type of URL, meaning you could register "azure://" with an AzureDriver, and the StorageHelper would automatically use the driver you provide.
This will make sure any part of the system can transparently use any custom driver.
wdyt?
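Something along these lines (a hypothetical sketch of the proposed interface, none of these methods exist yet):
```
from clearml.storage.helper import StorageHelper


class AzureDriver:
    # hypothetical custom driver implementing get/put for "azure://" URLs
    def upload_object(self, local_path, remote_url):
        ...

    def download_object(self, remote_url, local_path):
        ...


# proposed runtime registration: the URL scheme maps to the driver class,
# so the StorageHelper picks the right driver by prefix transparently
StorageHelper.register_driver("azure://", AzureDriver)
```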
using the cleanup service
Wait FlutteringWorm14, the cleanup service, or the Task.delete call? (these are not the same)
Hi DeliciousBluewhale87
clearml-agent 0.17.2 was just released with the fix, let me know if it works.
But out of curiosity, what's the point of doing a hyperparameter search on the value of the loss at the last epoch of the experiment?
The problem is that you might end up with a global min that is really nice, but it happened 3 epochs ago, and you only have the last checkpoint ...
BTW, the global min and the last min should not be very different if the model converges, wdyt?
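If you do want the optimizer to see the global minimum rather than the last value, one option (just a sketch, the project/task names and dummy losses are placeholders) is to report the running minimum as its own scalar series and point the HPO objective at that series:
```
from clearml import Task

task = Task.init(project_name="examples", task_name="hpo-objective-demo")
logger = task.get_logger()

val_losses = [0.9, 0.5, 0.3, 0.4, 0.6]  # stand-in for your real per-epoch validation loss
best_loss = float("inf")
for epoch, val_loss in enumerate(val_losses):
    best_loss = min(best_loss, val_loss)
    logger.report_scalar(title="validation", series="loss", value=val_loss, iteration=epoch)
    # use this series as the optimization objective to optimize the global min instead of the last epoch
    logger.report_scalar(title="validation", series="best_loss", value=best_loss, iteration=epoch)
```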
Okay, here is standalone code that should be close enough (if I missed anything let me know):
```
import tempfile
from datetime import datetime
from pathlib import Path

import tensorflow as tf
import tensorflow_datasets as tfds

from clearml import Task

task = Task.init(project_name="debug", task_name="test")

(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)


def normalize_img(image, labe...
```
So far my local and remote GitLab repositories are synchronized, so I suspect that the
Failed applying git diff, see diff above
error is caused by the cached repository from which ClearML tries to run the process. I've cleaned the cache, but it hasn't helped.
Hmm, can you test with empty "uncommitted changes"?
Just making sure: when you say it still doesn't work, you are not trying to run the Task with the git diff that includes the binary data, right?
Yes, but where I can fi...
BoredHedgehog47 can you provide some logs, this is odd..
'config.pbtxt' could not be inferred. please provide specific config.pbtxt definition.
This basically means there is no configuration on how to serve the model, i.e. the size/type of the input layer and the output layer.
You can either store the configuration on the creating Task, like is done here:
https://github.com/allegroai/clearml-serving/blob/b5f5d72046f878bd09505606ca1147d93a5df069/examples/keras/keras_mnist.py#L51
Or you can provide it as a standalone file when registering the mo...
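For reference, a hedged sketch of what a standalone config.pbtxt could look like (the model name, platform, tensor names, types and dims below are assumptions and must match your actual model):
```
name: "keras_mnist"
platform: "tensorflow_savedmodel"
input [
  {
    name: "dense_input"
    data_type: TYPE_FP32
    dims: [ 784 ]
  }
]
output [
  {
    name: "activation_2"
    data_type: TYPE_FP32
    dims: [ 10 ]
  }
]
```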
you can also just create a venv and run the tests there (with the latest python package) ?
WickedGoat98 the agent itself can be executed on bare metal, no need to set up a docker for it (although that is fully supported).
Specifically, the docker compose has the agent running in services mode, i.e. for lightweight CPU tasks such as running pipelines.
If the agent is running on a GPU, the easiest way is to run it on bare metal.
CooperativeFox72
Could you try to run the docker, and then inside the docker try to run:
su root
whoami
Hi ChubbyLouse32
If I understand correctly you can relatively easily take a ClearML Task and launch it on LSF; an integration would be something like:
```
import os
from time import sleep

from clearml import Task
from clearml.backend_api.session.client import APIClient

client = APIClient()
# q_id is the ID of the queue the LSF integration pulls Tasks from
while True:
    result = client.queues.get_next_task(queue=q_id)
    if not result or not result.entry:
        sleep(5)
        continue
    task_id = result.entry.task
    # here is where we create the LSF job, this is just pseudo code
    os.system("lsf-launch-cmd 'clearml...
```
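For context, the truncated last line would most likely wrap the LSF submission around `clearml-agent execute`; a hedged sketch of that line, continuing the loop above (the `bsub` queue name and flags are placeholders for whatever your LSF setup needs):
```
    # hypothetical submission line: let LSF schedule the job, the job itself runs the Task via the agent
    os.system("bsub -q my_lsf_queue 'clearml-agent execute --id {} --docker'".format(task_id))
```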
Send the full Task log, you can DM it if it is easier.
DrabCockroach54 notice here there is no aarch64 wheel for anything other than python 3.5...
(and in both cases only py 3.5/3.6 builds, everything else will be built from code)
https://pypi.org/project/pycryptodome/#files
Hi @<1523701601770934272:profile|GiganticMole91>
Do you mean something like a git ops triggered by PR / tag etc ?
Task.enqueue will execute immediately, I need to execute the task at a specific time
Oh I see what you mean: trigger -> scheduled (cron-like) -> Task executed.
Is that correct?
Oh I think I understand what's going on, @<1523701260895653888:profile|QuaintJellyfish58> let me check how to update the cron scheduler while it is running (I really like this idea, so if this is not already supported I'd like us to add this capability)
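In the meantime, a minimal sketch of how the current TaskScheduler is used (the task ID, queue name and schedule values are placeholders; double-check the argument names against your clearml version):
```
from clearml.automation import TaskScheduler

scheduler = TaskScheduler()
# clone the given Task and enqueue the clone every day at 07:30
scheduler.add_task(
    schedule_task_id="<task_id_to_clone>",
    queue="default",
    minute=30,
    hour=7,
    recurring=True,
)
scheduler.start()
```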
Hi UnsightlySeagull42
How can I reproduce this behavior ?
Are you getting all the console logs ?
Is it only the Tensorboard that is missing ?
I have to admit mounting it to a different drive is a good reason to bring this feature back; the reasoning was that the agent needs to make sure it manages them itself (e.g. when multiple agents are running on the same machine).