Hi @<1556450111259676672:profile|PlainSeaurchin97>
Is there any simple way to use `argparse` to pass a clearml task name?
need to call `args = task.connect(args)`.
noooo 🙂 there is no need to do that, the arguments are automatically detected
see for yourself
args = parse_args()
task = Task.init(task_name=args.task_name)
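For reference, here is a minimal sketch of the full flow. The `--task-name` and `--lr` flags and the `examples` project name are made up for illustration. The `Task` import sits inside `main()` only so the parser can be tried without a ClearML server.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical flags, for illustration only.
    parser = argparse.ArgumentParser(description="training script")
    parser.add_argument("--task-name", default="my experiment")
    parser.add_argument("--lr", type=float, default=0.001)
    return parser.parse_args(argv)


def main(argv=None):
    from clearml import Task  # needs clearml installed and configured

    args = parse_args(argv)
    # The task name comes straight from the command line.
    task = Task.init(project_name="examples", task_name=args.task_name)
    # No task.connect(args) here: per the reply above, the argparse
    # arguments are detected automatically.
    print("learning rate:", args.lr)
    return task
```

Calling `main(["--task-name", "baseline", "--lr", "0.01"])` should create a task named `baseline`, assuming a reachable ClearML server.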
Follow up: I see that if I move an Experiment to a new project, it does not copy the associated model files and must be done manually. Once I moved the models to the new project, the query works as expected.
Correct 🙂
Nice catch!
Hi DepressedFish57
In my case downloading each part takes ~5 seconds, and unzipping ~15.
We ran into that too. The new version will use a multithreaded approach for the unzip (meaning the unzipping will happen in the background)
` from time import sleep
from clearml import Task
import tqdm

task = Task.init(project_name='debug', task_name='test tqdm cr cl')
print('start')
for i in tqdm.tqdm(range(100)):
    sleep(1)
print('done') `
The above example code will output a line every 10 seconds (with the default console_cr_flush_period=10). Can you verify it works for you?
This points to the wrong cu117 / driver - could that be?
Hi JitteryCoyote63 a few implementation details on the services-mode, because I'm not certain I understand the issue.
The docker-agent (running in services mode) picks a Task from the services queue, sets up the docker for it, spins it up, and makes sure the Task starts running inside the docker. (Once it is running inside the docker, you will see the service Task registered as an additional node in the system until the Task ends.) Once that happens, the trains-agent will try to fetch the...
pytorch DDP
with what backend? gloo? nccl? openmpi?
Hi CleanPigeon16
Put the specific git into the "installed packages" section
It should look like:... git+
...
(No need for the specific commit, you can just take the latest)
Thanks ShakyJellyfish91 ! please let me know what you come up with, I would love for us to fix this issue.
Yep, automatically moving a tag
No, but you can get the last created/updated one with that tag (so I guess the same?)
meant like the best artifacts.
So artifacts can be retrieved like a dict:
https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts_retrieval.py
Task.get_task(project_name='examples', task_name='artifacts example').artifacts['name']
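A small sketch of the dict-style access. The `load_artifact` helper and the stand-in task are hypothetical and exist only so the snippet runs without a server. On a real task, `.get()` on an artifact entry returns the stored object.

```python
from types import SimpleNamespace


def load_artifact(task, name):
    """Return the deserialized object stored under `name` on `task`."""
    # task.artifacts behaves like a dict keyed by artifact name;
    # .get() on an entry returns the stored Python object.
    return task.artifacts[name].get()


# Stand-in task with made-up data, so the helper can be tried offline.
fake_task = SimpleNamespace(
    artifacts={"name": SimpleNamespace(get=lambda: {"accuracy": 0.9})}
)
result = load_artifact(fake_task, "name")
print(result)
```

With a real task you would pass `Task.get_task(project_name='examples', task_name='artifacts example')` instead of the stand-in. For file artifacts, `.get_local_copy()` returns a local path rather than an object.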
Do I set the CLEARML_FILES_HOST to the end point instead of an s3 bucket?
Yes, you are right, this is not straightforward:
CLEARML_FILES_HOST="s3://minio_ip:9001"
Notice you must specify "port" , this is how it knows this is not AWS. I would avoid using an IP and register the minio as a host on your local DNS / firewall. This way if you change the IP the links will not get broken 🙂
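If it helps, here is a quick sanity check you could run on the value before exporting it. `has_explicit_port` is a hypothetical helper, not part of ClearML. It only verifies that the URI uses the `s3://` scheme and carries a port, which is the requirement described above.

```python
from urllib.parse import urlparse


def has_explicit_port(uri):
    """True if `uri` is an s3:// URI whose host part includes a port."""
    parsed = urlparse(uri)
    return parsed.scheme == "s3" and parsed.port is not None


# Hypothetical host names, for illustration only.
print(has_explicit_port("s3://minio.local:9001"))   # host with port
print(has_explicit_port("s3://my-bucket/models"))   # plain bucket, no port
```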
Hi RipeGoose2
Any logs on the console ?
Could you test with a dummy example on the demoserver ?
Could you see if that makes a difference ?
AdventurousRabbit79 you mean like minio / ceph ?
like what all are important metric monitoring queries w.r.t. the serving tasks that can be visualized and shown in grafana?
Basically latency and requests per minute are automatically reported. Additional reports are based on your RestAPI in/out.
Imagine the following restapi request json payload
{"x": 123, "y": 456}
and a return json of
{"z": 789}
The metrics you can add to the monitoring are the keys on both these jsons, i.e. "x", "y", "z"
These metrics can be both log...
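To make the key idea concrete, a tiny sketch using the example payloads above. It only lists the candidate metric names and does not register anything with the serving service.

```python
import json

request_payload = json.loads('{"x": 123, "y": 456}')
response_payload = json.loads('{"z": 789}')

# Every top-level key of the request and the response is a candidate metric.
candidate_metrics = sorted(set(request_payload) | set(response_payload))
print(candidate_metrics)  # ['x', 'y', 'z']
```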
Hi HelplessCrocodile8
yes there is:
in the first case, the new_key will be automatically logged:
a_dict = {}
a_dict = task.connect(a_dict)
a_dict['new_key'] = 42
In the second example changes to the "object" passed to connect are not tracked
make sense ?
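A toy illustration of the difference. This is a conceptual stand-in, not ClearML's actual implementation: `TrackedDict` and `fake_connect` are hypothetical names. It assumes only what is described above, that writes through the object returned by connect are recorded and writes to the original object are not.

```python
class TrackedDict(dict):
    """Toy stand-in for the object returned by task.connect(a_dict)."""

    def __init__(self, data, on_write):
        super().__init__(data)
        self._on_write = on_write

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._on_write(key, value)  # stands in for syncing to the server


logged = {}  # plays the role of what the server would see


def fake_connect(params):
    # Hypothetical helper: returns a tracked copy of `params`.
    return TrackedDict(params, on_write=logged.__setitem__)


original = {}
tracked = fake_connect(original)
tracked["new_key"] = 42     # written through the returned object: recorded
original["other_key"] = 7   # written to the original object: not recorded
print(logged)
```

That is why the first pattern reassigns `a_dict` to the return value of `task.connect` before adding keys.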
Hi @<1533619725983027200:profile|BattyHedgehong22>
Can you elaborate ? what do you mean params file ?
Is this something like:
Task.current_task().connect_configuration('my_conf.json', name="my conf file")
And having a pdf is easier/better than sharing a link to the results page ?
UnsightlySeagull42 the assumption is that the agent has a read-only user with access to all repositories.
At the moment there is no way to configure it to have a different user/pass per repository in the clearml.conf
You can however:
- embed the user/pass on the repository link (not very secure)
- Use ssh-key and have it on .ssh on the host machine
- Use .git-credentials and configure them (with per project user/pass)
My apologies you are correct 1.8.1rc0 🙂
BTW: I think an easy fix could be:
if running_remotely():
    pipeline.start()
else:
    pipeline.create_draft()
Hi StickyBlackbird93
Yes, this agent version is rather old (clearml_agent v1.0.0).
It had a bug where the pytorch aarch wheel broke the agent (by default the agent in docker mode will use the latest stable version, but not in venv mode).
Basically, upgrade to the latest clearml-agent version, it should solve the issue:
pip3 install -U clearml-agent==1.2.3
BTW for future debugging, this is the interesting part of the log (Notice it is looking for the correct pytorch based on the auto de...
BTW: the same holds for tagging multiple experiments at once
- In a notebook, create a method and decorate it by fastai.script’s @call_parse.
Any chance you have a very simple code/notebook to reference (this will really help in fixing the issue)?
SuperiorPanda77 I have to admit, not sure what would cause the slowness only on GCP ... (if anything I would expect the network infrastructure would be faster)