Hmm, so there is a way to add callbacks (somewhat cumbersome, and we would love feedback) so you can filter them out.
What do you think, would that work?
Could you test with the latest clearml? pip install git+
Task.add_requirement(".") should be supported now 🙂
Hi WackyRabbit7
So I'm assuming this is after start_locally is called?
Which clearml version are you using ?
(just making sure, calling Task.current_task()
before starting the pipeline returns the correct Task?)
and I have no way to save those as clearml artifacts
You could do (at the end of the code): task.upload_artifact('profiler', Path('./fil-result/'))
wdyt?
Hi RoundMosquito25
What do you mean by "local commits" ?
Hi @<1668427971179843584:profile|GrumpySeahorse51>
Could you provide the full stack log?
this error seems to originate from psutil (which is used), but it lacks the clearml-session context
you mean in the enterprise
Enterprise, with the smarter GPU scheduler. This is an inherent problem of sharing resources; there is no perfect solution. You either have fairness, but then you get idle GPUs, or you have races, where you can get starvation.
EmbarrassedSpider34 I can update that an RC should be out later today with a fix 🙂
So the only difference is how I log into the machine to start clearml
The only difference that I can think of is the OS environments in the two login types:
Can you run export in the two cases and check the diff between them?
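A quick way to capture and compare the two environments (the file paths here are just illustrative):

```shell
# Dump the environment from each login type to a file (paths are illustrative):
export > /tmp/env_ssh.txt      # e.g. run this in the SSH session
export > /tmp/env_gui.txt      # e.g. run this in the desktop/GUI session
# Then compare the two dumps:
diff /tmp/env_ssh.txt /tmp/env_gui.txt
```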
I did nothing to generate a command-line. Just cloned the experiment and enqueued it. Used the server GUI.
Who/What created the initial experiment ?
I noticed that if I run the initial experiment by "python -m folder_name.script_name"
"-m module" as the script entry is used to launch entry points like python modules (i.e. it is translated back to "python -m module")
Why isn't the entry point just the python script?
The command line arguments are passed as arguments on the Args section of t...
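To illustrate the two launch styles side by side (all the names below are illustrative, not from the original experiment):

```shell
# Minimal package layout (names are illustrative):
cd "$(mktemp -d)"
mkdir -p folder_name && touch folder_name/__init__.py
printf 'print("running")\n' > folder_name/script_name.py

# As a module: the package is resolved from the current directory,
# so relative imports inside folder_name keep working.
python3 -m folder_name.script_name

# As a plain script: the entry point is the file path itself,
# and sys.path is rooted at folder_name/ instead of the project root.
python3 folder_name/script_name.py
```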
clearml==0.17.4
Great, and as you pointed changes are detected.
If I make some changes in the original file it gets tracked (picture below)... but what if I use a completely new and git-untracked py script?
Basically it will capture the output of "git diff" so if you have a new file but it is Not added, then no changes will appear.
That said, this is usually the exception, most files that are not added to the git should not be logged, hence the logic.
Anyhow the idea is that t...
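As a side note, if you do want a brand-new file's contents to show up in "git diff" without fully staging it, git's intent-to-add flag does exactly that (the file name below is illustrative):

```shell
# Fresh repo with one brand-new, untracked file (names are illustrative):
cd "$(mktemp -d)" && git init -q .
echo 'print("hello")' > new_script.py

git diff --name-only                  # prints nothing: untracked files are invisible to git diff

# Intent-to-add stages the path (not the content), so the file's
# contents now appear in git diff:
git add --intent-to-add new_script.py
git diff --name-only                  # now prints new_script.py
```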
You can make reports on experiments with interactive graphs
Yes, I can totally see how this is a selling point. The closest is the Project Overview (full markdown capabilities, with the ability to embed links to specific experiments). You can also add a "leader metric", so you can track the project performance/progress over time.
I have to admit that creating a better reporting tool is always pushed down in priority as I think this is a good selling point to management but the actual ...
ShallowCat10 try something similar to this one; do notice that it might take a while to get all the task objects, so I would start with a single one 🙂
```
from trains import Task

tasks = Task.get_tasks(project_name='my_project')
for task in tasks:
    scalars = task.get_reported_scalars()
    for x, y in zip(scalars['title']['original_series']['x'], scalars['title']['original_series']['y']):
        task.get_logger().report_scalar(title='title', series='new_series', value=y, iteration=...
```
PompousBeetle71 cool, next RC will have the argparse exclusion feature :)
it looks like nvidia is going to come up with a UI for TAO too
Interesting, any reference we could look at ?
Hi ThoughtfulBadger56
If I clone and enqueue the cloned task on the webapp, does the clearml server execute the whole cmd above?
You mean agent will execute it? Do you have Task.init inside your code ?
My only point is: if we have no force_git_ssh_port or force_git_ssh_user, we should not touch the SSH link (i.e. less chance of us messing with the original URL if no one asked us to)
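For reference, these map to agent-side settings in clearml.conf; a minimal sketch of that section, assuming the standard clearml-agent configuration keys (the values below are illustrative):

```
agent {
    # Rewrite https git URLs to SSH (the behavior this thread is about)
    force_git_ssh_protocol: true
    # Only when explicitly set should the port/user in the SSH link be overridden
    force_git_ssh_port: 2222
    force_git_ssh_user: git
}
```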
ClearML automatically gets these reported metrics from TB; since you mentioned seeing the scalars, I assume huggingface reports to TB. Could you verify? Is there a quick code sample to reproduce?
We should probably change it so it is more human readable 🙂
What happened in the server configuration that all of a sudden you have zero ports open?
PungentLouse55 hmmm
Do you have an idea on how we could quickly reproduce it?
it will constantly try to resend logs
Notice this happens in the background, in theory you will just get stderr messages when it fails to send but the training should continue
Yes docker was not installed in the machine
Okay, makes sense. We should definitely check that you have docker before starting the daemon 😉
Ok, it would be nice to have a --user-folder-mounted flag that does the linking automatically
It might be misleading if you are running on a k8s cluster, where one cannot just -v mount a volume...
What do you think?
CooperativeFox72 this is indeed sad news 😞
When you have the time, please see if you can send a code snippet to reproduce the issue. I'd like to have it fixed
Thanks CooperativeFox72 ! I'll test and keep you posted 🙂