We need to evaluate the result across many random seeds, so each task needs to log the result independently.
Ohh that kind of makes sense to me 🙂
Yes I'm also getting:
/usr/local/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 74 leaked semaphores to clean up at shutdown
len(cache))
Not sure about that ...
Hi NaughtyFish36
C++ module fails to import, anyone have any insight? The required C++ compilers seem to be installed on the docker container.
Can you provide log for the failed Task?
BTW: if you need build-essential
you can add it in the Task startup script: apt-get install build-essential
Plan is to have it out in the next couple of weeks.
Together with a major update in v0.16
Let me know if I understand you correctly, the main goal is to control the model serving, and deploy to your K8s cluster, is that correct ?
Hi DisturbedParrot38
Could you attach a full log? This is quite cryptic and does not ring a bell
OddAlligator72 I like this idea.
The single thing I'm not sure about is the "function entry point"
Why would one do that? Meaning why wouldn't you have a proper python entry-point.
The reason I'm reluctant is that you might have calls/functions/variables in the global scope of the file storing the function, and then users will not know why something broke, and it will be very cumbersome to debug.
A simple script entry point seems trivial to launch and debug locally.
What do you think ? What woul...
UnevenOstrich23
but interesting that auto-reload config is not working as I expected.
Unfortunately the trains-agent does not support auto-reloading the config file yet. If you think this will be a great feature, please feel free to open a GitHub feature request issue 🙂
Would be very cool if you could include this use case!
I totally think we should, any chance you can open an Issue, so this feature is not lost?
in the UI the installed packages will be determined through the code via the imports as usual ...
This is only in a case where a user manually executed their code (i.e. without trains-agent), then in the UI after they clone the experiment, they can click on the "Clear" button (hover over the "installed packages" to see it) and remove all the automatically detected packages. This will result in the trains-agent using the "requirements.txt".
GiddyTurkey39
as others will also be running the same scripts from their own local development machine
Which would mean trains will update the installed packages, no?
This is why I was inquiring about the requirements.txt file,
My apologies, of course this is supported 🙂
If you have no "installed packages" (i.e. the field is empty in the UI) the trains-agent
will revert to installing the requirements.txt
from the git repo itself, then it...
Is there a way to do this without manually editing installed packages?
Running your code once with Task.init should automatically detect all the directly imported packages, then when trains-agent executes the Task, it will install them into a clean venv and put back all the packages inside the venv.
In order for all the used packages (e.g. bigquery) to appear in the "Installed packages" your code needs to be executed once manually (i.e. not with trains-agent), then the tra...
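For reference, a minimal sketch of that first manual run (the project/task names and the numpy import are just placeholders; older versions import Task from trains instead of clearml):
from clearml import Task

# Running the script once on your own machine lets the automatic package
# detection populate the "Installed packages" section of the Task.
task = Task.init(project_name="examples", task_name="package detection demo")

# Only packages imported directly by the script are picked up,
# so import them at the top level of the script.
import numpy as np

print(np.zeros(3))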
would I have to execute each task in the pipeline locally (but still connected to trains),
Somehow you have to have the pipeline step Task in the system, you can import it from code, or you can run it once, then the pipeline will clone it and reuse it. Am I missing something ?
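A rough sketch of that clone-and-reuse flow, assuming the SDK's Task.clone / Task.enqueue helpers (project, task and queue names are placeholders):
from clearml import Task

# The step Task only needs to exist in the system once (e.g. from a single manual run)
base = Task.get_task(project_name="examples", task_name="pipeline step 1")

# Clone it and send the clone to an agent queue; the original stays untouched
cloned = Task.clone(source_task=base, name="pipeline step 1 (cloned)")
Task.enqueue(cloned, queue_name="default")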
Hi CrookedWalrus33
the python version is auto-detected and registered at "manual execution" time (i.e. when you run your code on your machine).
That said, this is only a suggestion for the agent; it will use it only if it can actually find the matching Python version, otherwise it will use whatever is available (i.e. look through the PATH environment for a matching pythonX.Y executable).
The easiest way to support this would be to just make sure the python binary's path is added to the PATH env.
Does...
So to conclude: it has to be executed manually first, then with trains agent?
Yes, that said, as you mentioned, you can always edit the "installed packages" once manually, from that point you are basically cloning the experiment, including the "installed packages" so it should work if the original worked.
Make sense ?
BTW: in your code, you should probably replace dataset_task = Task.get_task(task_id=dataset.id)
with: dataset_task = dataset._task
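For context, a minimal sketch (dataset project/name are placeholders, and _task is a private attribute that may change between versions):
from clearml import Dataset

dataset = Dataset.get(dataset_project="examples", dataset_name="my dataset")

# Instead of a second lookup by id:
#   dataset_task = Task.get_task(task_id=dataset.id)
# the Dataset object already carries its backing Task:
dataset_task = dataset._task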
Hurrah Hurrah
Could I just build it and log these parameters using task.set_parameters() so that I can call task.get_parameters() later?
Instead of manually calling set/get, you call task.connect(some_dict_or_object), it does both:
When running manually (i.e. without an agent) it logs the keys/values on the Task,
when running with an agent, it takes the values from the backend (Task) and sets them on the dict/object
Make sense ?
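A minimal sketch of that pattern (project/task names and the parameter dict are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="connect demo")

params = {"batch_size": 32, "learning_rate": 0.001}
# Manual run: these keys/values are logged on the Task.
# Agent run: the values stored on the Task in the backend overwrite the dict.
params = task.connect(params)

print(params["learning_rate"])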
EnviousStarfish54
plt.show will capture the figure; if you call it multiple times, it will add a running number to the figure itself (because the figure might change, and you might want the history)
if you call plt.imshow, it's the equivalent of debug image, hence it will be shown in the debug-samples tab, as an image.
Make sense ?
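Roughly, assuming the usual automatic matplotlib capture (project/task names and the sample data are placeholders):
import numpy as np
import matplotlib.pyplot as plt
from clearml import Task

task = Task.init(project_name="examples", task_name="matplotlib demo")

# Captured as a plot; repeated plt.show() calls get a running number
plt.plot([1, 2, 3], [4, 5, 6])
plt.show()

# imshow figures end up as debug images under the debug-samples tab
plt.imshow(np.random.rand(32, 32), cmap="gray")
plt.show()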
what's the error/reply ?
Does it say it runs something ?
(on the workers tab on the agents table it should say which Task it is running)
Hi AverageBee39
It seems the json is corrupted, could that be ?
Hi BurlyRaccoon64
What do you mean by "custom_build_script" ? Not sure I found it in "clearml.conf"
https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf
Local changes are applied before installing requirements, right?
correct
I'm not sure if it matters, but 'kwcoco' is being imported inside one of the repo's functions and not in the script's header.
Should work.
when you run pip freeze inside the same env what are you getting ?
Also, is there any other import that is missing? (basically 'clearml' tries to be smart, and sees if maybe the script itself, even though inside a repo, is not actually importing anything from the repo itself, and if this is the case it will only analyze the original script. Basically...
You need to use tf.summary.image and not summary_ops_v2.image
Fixed on main branch (see github issue), RC later today
Image needs to be in range [0, 1] and not [0, 255] (matplotlib and tensorboard can handle either one)
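A minimal TF2 sketch of the fix (the log directory and tensor are placeholders):
import tensorflow as tf

writer = tf.summary.create_file_writer("./tb_logs")

# Values must already be scaled to [0, 1]; shape is [batch, height, width, channels]
img = tf.random.uniform([1, 64, 64, 3], minval=0.0, maxval=1.0)

with writer.as_default():
    # Use the public tf.summary.image API rather than summary_ops_v2.image
    tf.summary.image("sample_image", img, step=0)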
Is there a code to reproduce ?
And your ~/clearml.conf ?
MotionlessCoral18 I think there is a fix in the latest clearml-agent RC 1.4.0rc0, can you test and update if you are still having this issue?
I'm thinking it's generically a kernel gateway issue, but I'm not sure if other platforms are using that yet
The odd thing is that you can access the notebook, but it returns zero kernels ..