ResponsiveCamel97
BTW: any reason not to allow this flexibility ?
Okay, I was able to reproduce it (this is odd) let me check ...
Thanks for the ping ConvolutedChicken69 , I missed it 😞
from what i see in the docs it's only for Jupyter / VS Code, i didn't see anything about pycharm
PyCharm is basically SSH, which is supported 🙂
(Maybe we should mention it in the docs?)
Hi GreasyPenguin14
It looks like you are trying to delete a Task that does not exist
Any chance the cleanup service is misconfigured (i.e. accessing the incorrect server) ?
JitteryCoyote63 There is a basic elastic license that should always be there. If for some reason it was deleted/expired then the following command should fix it:
curl -XPOST 'http://localhost:9200/_xpack/license/start_basic'
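If curl is not available on the machine, the same call can be sketched in Python with only the standard library. This is just an illustration: `build_start_basic_request` is a made-up helper name, and it assumes Elasticsearch is reachable on the same `localhost:9200` address as in the curl command.

```python
import urllib.request


def build_start_basic_request(es_url: str = "http://localhost:9200") -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl command."""
    return urllib.request.Request(
        es_url.rstrip("/") + "/_xpack/license/start_basic",
        data=b"",          # empty body, like curl -XPOST without -d
        method="POST",
    )


req = build_start_basic_request()
print(req.get_method(), req.full_url)

# To actually send it (needs a reachable Elasticsearch):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read().decode())
```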
Failing when passing the diff to the git command...
where is it running? could you restart all the dockers ? Is it running on your machine?
It should also work with host IP and two docker compose files.
I'm not sure where to push for a unified docker compose?
Hi MagnificentSeaurchin79
Yes this is a bit confusing 🙂
Datasets are stored as delta changes from parent versions.
A dataset contains a list of files and a list of artifacts where these files exist. This means that if we create a new dataset from a parent dataset and want to add a file, we have to add a link to the file and have a new artifact containing just the delta (i.e. the new file) from the parent version. When you delete a file you just remove the li...
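To make the delta idea concrete, here is a minimal pure-Python sketch. It is not how ClearML implements it internally; `DatasetVersion`, `added`, `removed` and `list_files` are names made up for illustration. Each version stores only what changed relative to its parent, and the full file list is resolved by walking the parent chain.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class DatasetVersion:
    """Hypothetical model of one dataset version stored as a delta."""
    name: str
    parent: Optional["DatasetVersion"] = None
    # file path -> name of the artifact that physically holds the file
    added: Dict[str, str] = field(default_factory=dict)
    # file links dropped relative to the parent (no data is rewritten)
    removed: Set[str] = field(default_factory=set)

    def list_files(self) -> Dict[str, str]:
        """Resolve the full file -> artifact mapping by walking up the parents."""
        files = dict(self.parent.list_files()) if self.parent else {}
        for path in self.removed:
            files.pop(path, None)
        files.update(self.added)
        return files


base = DatasetVersion("v1", added={"a.csv": "v1-artifact", "b.csv": "v1-artifact"})
# Adding a file: the child only stores a new artifact holding the delta
v2 = DatasetVersion("v2", parent=base, added={"c.csv": "v2-artifact"})
# Deleting a file: the child only drops the link, the parent artifact is untouched
v3 = DatasetVersion("v3", parent=v2, removed={"a.csv"})

print(sorted(v3.list_files()))
```

Note that `v1` still lists `a.csv` after `v3` removes it, since only the child's view changes.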
Hi SoreDragonfly16
The warning you mention means that someone changed the state of the experiment to aborted, which in turn will actually kill the process.
What do you mean by "If I disable the logger," ?
GrumpyPenguin23 could you help and point us to an overview/getting-started video?
Hi MagnificentSeaurchin79
Unfortunately there is currently no way to reorder the plots, but you have a valid point. I suggest a GitHub UX issue ?
Regarding the debug samples, the difference is that the confusion matrix report is actually metadata, you can get these numbers by the API or the download, but the debug samples are static images ...
BTW: you can try to produce an interactive side by side confusion matrix with plotly, and use report_plotly_figure
Thanks MagnificentSeaurchin79 !
Let me check what's the status with this one, could it be the same as this one?
https://github.com/allegroai/clearml/issues/322
ClearML automatically picks up these reported metrics from TB. Since you mentioned you see the scalars, I assume huggingface reports to TB. Could you verify? Is there a quick code sample to reproduce?
OddAlligator72 let's separate the two issues:
1. Continue reporting from a previous iteration
2. Retrieving a previously stored checkpoint
Now for the details:
Are you referring to a scenario where you execute your code manually (i.e. without the trains-agent) ?
Hi BurlyRaccoon64
Yes, we did. The latest clearml-agent solves the issue, please try:
'pip3 install -U --pre clearml-agent'
Correct, and that also means the code that runs is not auto-magically logged.
Hi WorriedParrot51
Take a look at the Experiment execution section:
there is script and working directory
working directory is the base of the git repository (which is cloned into the docker file)
So if for some reason trains did not properly detect the current working dir, here is what should solve the issue, without changing the PYTHONPATH
script path: ./sub_folder/script.py
working directory: .
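As a rough illustration of how the two fields combine (a sketch only, not the agent's code; `resolve_entry_point` and the `/tmp/repo_clone` path are made up, and it assumes the script path is taken relative to the working directory, which is itself relative to the repository root):

```python
from pathlib import PurePosixPath


def resolve_entry_point(repo_root: str, working_dir: str, script_path: str) -> PurePosixPath:
    """Hypothetical helper: join the clone root, working directory and script path."""
    # pathlib drops the "." components, so "." / "./sub_folder/script.py" stays clean
    return PurePosixPath(repo_root) / working_dir / script_path


entry = resolve_entry_point("/tmp/repo_clone", ".", "./sub_folder/script.py")
print(entry)
```

With the working directory set to the repository root, imports from sibling folders resolve without touching PYTHONPATH, because the process starts from the root.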
What do you think?
It's just a custom module.
Is this your own module ? Is this a local folder we import from ?
Basically you create the Task and make sure the "Dataset" is attached to it:
task = Task.init(...)
dataset = Dataset.create(task=task)
dataset.add_files(...)
This will make sure the code is attached to the Dataset
Actually, no. This is to spin up the clearml-server on GCP, not the agent
I think it can only run on multiple GPUs on one node
Okay, the first step is to make sure your code is multi-node enabled, there is no magic for that 🙂
in order to work with ssh cloning, one has to manually install openssh-client to the docker image, looks like that
Correct, you have to have SSH inside the container so that git can use it.
You can always install it with the following setting inside your agent's clearml.conf:
extra_docker_shell_script: ["apt-get install -y openssh-client", ]
https://github.com/allegroai/clearml-agent/blob/73625bf00fc7b4506554c1df9abd393b49b2a8ed/docs/clearml.conf#L145
Sorry @<1657918706052763648:profile|SillyRobin38> I missed this reply
Is ClearML-Serving using either System or CUDA shared memory?
This needs to be set on the docker-compose:
and I think this line actually includes ipc: host which means there is no need to set the shm_size, but you can play around with it and let me know if you see a difference
[None](https://github.com/allegroai/clearml-serving/blob/7ba356efc97a6ae2159283d198d981b3c1ab85e6/docker/docker-compose-triton-gpu.yml#L1...
JitteryCoyote63 any chance you have a log of the failed torch 1.7.0 ?
Thanks!
fyi: This section is not necessary if you have a clearml.conf file in ~/
Task.set_credentials(
    api_host=" ", web_host=" ", files_host=" ",
    key='********************', secret='***********************'
)
Let me check the code for a min
Can you post them, I think there is something there that prevents the update (i.e. pip related).
For example:
packagename @ git+https:///....
Will be translated by pip to:
If packagename is installed do nothing, if it is not installed use git+https://... to install it
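Roughly, the behaviour described above looks like this. It is a simplified sketch, not pip's real resolver: `plan_install` and `is_installed` are made-up helpers and the URL is a placeholder. It only checks whether a distribution with that name is already installed, it does not compare versions or commits.

```python
from importlib import metadata
from typing import Callable, Optional


def is_installed(name: str) -> bool:
    """True if a distribution called `name` is present in this environment."""
    try:
        metadata.version(name)
        return True
    except metadata.PackageNotFoundError:
        return False


def plan_install(requirement: str, installed: Callable[[str], bool] = is_installed) -> Optional[str]:
    """Return the URL to install from, or None if nothing needs to be done."""
    # Split on the first "@" only, the URL itself may contain "@branch"
    name, sep, url = requirement.partition("@")
    name, url = name.strip(), url.strip()
    if not sep or not url:
        raise ValueError("expected '<name> @ <url>'")
    return None if installed(name) else url


req = "packagename @ git+https://example.invalid/repo.git"
print(plan_install(req, installed=lambda n: True))    # already installed, nothing to do
print(plan_install(req, installed=lambda n: False))   # not installed, install from the URL
```

This is why a package that is already installed in the environment is not fetched again from the git URL, even if the repository has changed.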
Yes, it is fully supported and should work.
Could you share the full execution log ?