ShakyJellyfish91 can you check if version 1.0.6rc2
can find the changes ?
If you have CUDA 10.2, then the cu101 build of torch 1.3.1 should work
Notice both need to be str
btw, if you need the entire folder just use StorageManager.upload_folder
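Roughly like this (the local path and bucket URI below are just placeholders):
from clearml import StorageManager
# uploads the entire local folder (keeping its structure) to the remote prefix
StorageManager.upload_folder("/path/to/local_folder", remote_url="s3://my-bucket/some/prefix")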
- Artifacts and models will be uploaded to the output URI, debug images are uploaded to the default file server. It can be changed via the Logger.
- Hmm is this like a configuration file?
You can do:
local_text_file = task.connect_configuration('filenotingit.txt')
Then open 'local_text_file'; it will create a local copy of the data at runtime, and the content will be stored on the Task itself.
- This is how the agent installs the python packages, but if the docker already contains th...
TenseOstrich47 / PleasantGiraffe85
The next version (I think releasing today) will already contain scheduling, and the next one (probably an RC right after) will include triggering. That said, currently the UI wizard for both (i.e. creating the triggers) is only available in the community hosted service. Still, I think that creating them from code (triggers/schedule) actually makes a lot of sense,
pipeline presented in a clear UI,
This is actually being actively worked on, I think Anxious...
Hi PanickyLion56
Yep, savefig also works. You can also do:
from clearml import Logger
Logger.current_logger().report_matplotlib_figure(title="My Plot Title", series="My Plot Series", iteration=10, figure=plt)
https://github.com/allegroai/clearml/blob/0c5d12b830987aa9bb8d44d81e92ff9198008f29/examples/frameworks/matplotlib/matplotlib_example.py#L25
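For reference, a slightly fuller runnable sketch of the same call (the project/task names and the plot itself are just placeholders):
import matplotlib.pyplot as plt
from clearml import Task, Logger

task = Task.init(project_name="examples", task_name="matplotlib report")
plt.plot([1, 2, 3], [4, 5, 6])
Logger.current_logger().report_matplotlib_figure(title="My Plot Title", series="My Plot Series", iteration=10, figure=plt)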
When exactly are you getting this error ?
VictoriousPenguin97 basically spin down serverA (this should flush all DBs), then copy /opt/clearml to the new server and spin it up with docker-compose. As long as the new server is on the same address as the previous one, everything should work out of the box
And I saw that it uploads the notebook itself as a notebook. Is this normal? Is there a way to disable it?
Hi FriendlyElk26
Yes this is normal, it backs up your notebook as well as converts it into python code (see "Execution - uncommitted changes") so that later the clearml-agent will be able to run it for you on remote machines.
You can also use task.connect({"param": "value"})
to expose arguments to use in the notebook so that later you will be able to change them from the U...
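For example, something along these lines (the parameter names/values are just placeholders):
params = {"batch_size": 32, "learning_rate": 0.001}
params = task.connect(params)  # appears under the task's hyperparameters
# when the clearml-agent re-runs the task, values edited in the UI are injected back here
print(params["learning_rate"])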
We are working on 1.3.0 so this is right in time
So you are saying it ignored everything after the bucket's "/" ?
It should have been:
output_uri="s3://company-clearml/artifacts/bethan/sales_journeys/artifacts/examples/load_artifacts.f0f4d1cd5eb54795b11508dd1e739145/artifacts/filename.csv.gz/filename.csv.gz"
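If I understand the setup, the destination is usually set roughly like this (project/task names are placeholders, the bucket path is the one from above):
from clearml import Task
task = Task.init(project_name="examples", task_name="load_artifacts", output_uri="s3://company-clearml/artifacts/bethan/sales_journeys")
task.upload_artifact("filename.csv.gz", artifact_object="filename.csv.gz")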
Quick update, I might have been able to reproduce the issue (GreasyPenguin14 working "offline" is a great hack to accelerate debugging this issue, thank you!)
It seems it is related to the known and very annoying Python forking issue (and this is why changing to the "spawn" method solves it):
https://bugs.python.org/issue6721
Long story short, in some cases when forking (i.e. ProcessPoolExecutor), Python can copy locks in a "bad" state, which means you can end up with a lock acquir...
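For anyone hitting this, a minimal sketch of the "spawn" workaround (my own illustration, not the reproducing code):
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

def work(x):
    return x * x

if __name__ == "__main__":
    # "spawn" starts child processes fresh instead of forking,
    # so no locks are copied over in an already-acquired state
    ctx = mp.get_context("spawn")
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as pool:
        print(list(pool.map(work, range(4))))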
try:
None
docker_install_opencv_libs: true
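If I'm not mistaken, this flag lives under the agent section of clearml.conf, i.e. something like:
agent {
    # tells the agent (in docker mode) to also install the system libraries opencv needs
    docker_install_opencv_libs: true
}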
ReassuredTiger98 regarding the agent error, can you see the package some_packge
in the "Installed Packages" section in the UI? Was it installed? Are you using pip or conda as the package manager in the agent (check clearml.conf)? Is the agent running in docker mode?
That being said, it returns None for me when I reload a task, but it's probably something on my side.
MistakenDragonfly51 just making sure, you did call Task.init, correct ?
What does
from clearml import Task
task = Task.current_task()
return?
Notice that you need to create the Task before actually calling Logger.current_logger()
or Task.current_task()
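i.e. something like (project/task names are placeholders):
from clearml import Task, Logger

task = Task.init(project_name="examples", task_name="my task")
# only after Task.init() do these return the live task / logger instead of None
print(Task.current_task().id)
Logger.current_logger().report_text("hello")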
others from the local environment and this causes a conflict when importing the attr module
Inside the docker ? " local environment" ?
This is all under "root" no?
Thanks!
Hmm from here : None
Could it be you do not have privileges to the resource, or that you did not provide credentials ?
Did that autoscaler work before ?
That is odd ...
Could you open a GitHub issue?
Is this on any upload? How do I reproduce it ?
In our case this is not possible due to client security (e.g. training data from clients can potentially be 'reverse engineered' from trained models in future).
Hmm I see, wouldn't it make more sense to separate clients like a multi-tenant SaaS solution ?
Hi RobustFlamingo1
The ClearML Orchestrator looks interesting. But the website suggests that K8S is required
No, k8s is not a must, only an option 🙂
We have a Linux training box (LambdaBox) where we want to run training. Can we place the ClearML orchestrator agent on the machine without needing K8S?
Yes, should be quite easy.
If you intend to use containers, make sure you have docker installed.
Then just pip install clearml-agent
and configure it:
https://clear.ml/doc...
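Roughly (the queue name here is just a placeholder):
pip install clearml-agent
clearml-agent init                    # paste the server credentials when prompted
clearml-agent daemon --queue default --docker    # add --docker only if you want container mode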
I did change the
instead of 8080?
So this is the issue
Btw I sometimes get a gzip error when I am accessing artefacts via the '.get()' part.
Hmm this is odd, is this a download issue? If this is reproducible maybe we should investigate further...
Hi MysteriousCow84
only one of them uses an already created venv from cache for this task. And the other node starts to re-create the same virtual environment.
Just to be clear, the second one is running, but it does not use the same venv as the other one (that is running in parallel), is that correct?
I was wondering about what i can do with the agent's argparse magic
You mean how to pass arguments to the components of a pipeline? Btw, did you check the pipeline example here?
None
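On the argparse side, just to illustrate what the "magic" does (my own sketch, names are placeholders): once Task.init() is called, anything parsed with argparse is logged under the task's Args and can be changed from the UI when an agent re-runs it.
from argparse import ArgumentParser
from clearml import Task

task = Task.init(project_name="examples", task_name="argparse demo")
parser = ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
parser.add_argument("--epochs", type=int, default=10)
args = parser.parse_args()  # edited Args values are injected here on the agent side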