Hi! I was wondering why ClearML recognizes Scikit-learn scalers as Input Models...
Hi GiganticTurtle0
any joblib.load/save call is logged by clearml (it cannot actually differentiate what it is used for ...)
You can of course disable it with Task.init(..., auto_connect_frameworks={'joblib': False})
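For example, a minimal sketch (the project/task names and scaler file are just illustrative):
from clearml import Task
from sklearn.preprocessing import StandardScaler
import joblib

# Turn off only the joblib auto-logging; other frameworks stay connected
task = Task.init(project_name='examples', task_name='no joblib logging',
                 auto_connect_frameworks={'joblib': False})

scaler = StandardScaler()
# With the flag above, this dump should no longer be registered as a Model
joblib.dump(scaler, 'scaler.pkl')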
MuddySquid7 the fix was pushed to GitHub, you can now install directly from the repo:
pip install git+
I'm working on creating a custom config with Istio
That is awesome! Let me know if we could help 🙂
Also please consider PRing it, I'm sure other users will appreciate the option
If the same happens in venv mode, check whether the pip process actually finished (you can find it with ps -Af | grep pip)
Docker mode. They do share the same folder, with the training data mounted as a volume, but only for reading the data.
Any chance they try to store the TensorBoard logs in this folder? This could lead to "No such file or directory: 'runs'" if one is deleting it while the other is trying to access it, or similar scenarios
ElegantKangaroo44 definitely a bug, will be fixed in 0.15.1 (release in a week or so)
https://github.com/allegroai/trains/issues/140
Hi WittyOwl57
That's actually how it works (the original idea/design was borrowed from libcloud): basically you need to create a Drive, then the storage manager will use it.
Abstract class here:
https://github.com/allegroai/clearml/blob/6c96e6017403d4b3f991f7401e68c9aa71d55aa5/clearml/storage/helper.py#L51
Is this what you had in mind ?
Hi GreasyPenguin14
Sure you can, although a bit convoluted (I'll make sure we have a nice interface 🙂)
import hashlib
title = hashlib.md5('epoch_accuracy_title'.encode('utf-8')).hexdigest()
series = hashlib.md5('epoch_accuracy_series'.encode('utf-8')).hexdigest()
task_filter = {
    'page_size': 2,
    'page': 0,
    'order_by': ['last_metrics.{}.{}'.format(title, series)]
}
queried_tasks = Task.get_tasks(project_name='examples', task_filter=task_filter)
Thanks OutrageousGrasshopper93
I will test it with "!".
By the way, is the "!" in the project or the Task name?
Hi OutrageousGrasshopper93
I think what you are looking for is Task.import_task and Task.export_task
https://allegro.ai/docs/task.html#trains.task.Task.import_task
https://allegro.ai/docs/task.html#trains.task.Task.export_task
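A rough sketch of how that could look (the project/task names are just placeholders):
from clearml import Task

# Export the source task definition into a plain dict
source_task = Task.get_task(project_name='examples', task_name='my experiment')
task_data = source_task.export_task()

# ... move task_data to the other workspace/server ...

# Re-create the task from the exported dict
imported_task = Task.import_task(task_data, target_project='imported experiments')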
I am running clearml-agent in docker mode btw.
Try -e PYTHONOPTIMIZE=1
in the docker args section, should do the same 🙂
https://docs.python.org/3/using/cmdline.html#envvar-PYTHONOPTIMIZE
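If you prefer setting it from code rather than the UI, a sketch using Task.set_base_docker (the image name below is only an example):
from clearml import Task

task = Task.init(project_name='examples', task_name='optimized run')
# Ask the agent to pass the env var to the container it launches
task.set_base_docker(docker_image='nvidia/cuda:11.7.1-runtime-ubuntu22.04',
                     docker_arguments='-e PYTHONOPTIMIZE=1')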
Hmm SuccessfulKoala55 any chance the nginx http was pushed to v1.1 on the latest cloud helm chart?
I cannot reproduce, tested with the same matplotlib version and python against the community server
You can control it with auto_ arguments in the Task.init call
https://clear.ml/docs/latest/docs/references/sdk/task#taskinit
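A quick sketch of the relevant flags (the values shown are only examples, not defaults):
from clearml import Task

task = Task.init(
    project_name='examples', task_name='controlled auto-logging',
    auto_connect_frameworks={'matplotlib': False, 'tensorboard': True},  # per-framework toggles
    auto_connect_arg_parser=True,    # log argparse arguments
    auto_resource_monitoring=True,   # CPU/GPU/network monitoring
)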
Perhaps it is the imports at the start of the script only being assigned to the first task that is created?
Correct!
However, when I split the experiment task out completely it seems to have built the cloned task correctly.
Nice!!
GiddyTurkey39 Just making sure, you ran ping IP
not ping ip:port
right ?
the first runs perfectly fine,
Just making sure, running in an agent?
the second crashes
Running inside the same container as the first one ?
The remaining problem is that this way, they are visible in the ClearML web UI which is potentially unsafe / bad practice, see screenshot below.
Ohhh that makes sense now, thank you 🙂
Assuming these are one-time credentials for every agent, you can add these arguments under "extra_docker_arguments" in clearml.conf
Then make sure they are also listed in: hide_docker_command_env_vars
which should cover the console log as well
https://github.com/allegroai/clearml-agent/blob/26e6...
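For reference, a rough clearml.conf sketch (MY_SECRET_TOKEN is a placeholder name, not a real key):
agent {
    # extra arguments added to the docker command line (one-time credentials)
    extra_docker_arguments: ["-e", "MY_SECRET_TOKEN=xxxx"]

    hide_docker_command_env_vars {
        enabled: true
        # also mask this variable in the console log / printed docker command
        extra_keys: ["MY_SECRET_TOKEN"]
    }
}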
Hi @<1523701066867150848:profile|JitteryCoyote63>
Could you please push the code for that version on github?
Oh, it seems like it is not synced, thank you for noticing (it will be taken care of immediately)
Regarding the issue:
Look at the attached images
None does not contain a specific wheel for cuda117 on x86, they use the default pip one
Also SoreDragonfly16, could you test if the issue exists with trains==0.16.2rc0 ?
It might be that the worker was killed before it unregistered; you will see it there but the last update will be stuck (after 10 min it will be automatically removed)
Yes. Because my old
has never been resolved (though closed), we use the dataset object to upload e.g. local files needed for remote execution.
Ohh now I remember... following this line, can I assume these files are reused, i.e. this is not "per instance"? I have to admit I have a feeling this is a very unique use case, and maybe the "old" way Datasets were shown is better suited?
No, I mean why does it show up in the task view (see attached image), forcing me to clic...
PompousBeetle71 let me know if it solves your problem
Hi JitteryCoyote63
The new pipeline is almost ready for release (0.16.2),
It actually contains this exact scenario support.
Check out the example, and let me know if it fits what you are looking for:
https://github.com/allegroai/trains/blob/master/examples/pipeline/pipeline_controller.py
Hmm SuccessfulKoala55 what do you think?
SubstantialElk6 is this the pip to install the agent, or the pip the agent is using to install the packages for the specific experiment ?
Are Kwargs supported in functions decorated as a pipeline component?
They are, but I think the main issue is the casting; without prior knowledge, everything will be a string
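A rough sketch of what I mean (assuming the PipelineDecorator interface; names and values are illustrative). Type hints on the positional arguments can help the controller cast the stringified inputs back, while kwargs need explicit casting:
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=['total'])
def add_numbers(a: int, b: int = 2, **kwargs):
    # kwargs arrive without type information, so cast them explicitly
    extra = int(kwargs.get('c', 0))
    return a + b + extra

@PipelineDecorator.pipeline(name='kwargs example', project='examples', version='0.1')
def my_pipeline():
    print(add_numbers(1, b=2, c=3))

if __name__ == '__main__':
    PipelineDecorator.run_locally()
    my_pipeline()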
TrickySheep9
Is there a way to see a roadmap on such things?
Hmm I think we have some internal one, I have to admit these things change priority all the time (so it is hard to put an actual date on them).
Generally speaking, pipelines with functions should be out in a week or so, TaskScheduler + Task Triggers should be out at about the same time.
UI for creating pipelines directly from the web app is in the works, but I do not have a specific ETA on that
WackyRabbit7 my apologies for the lack of background in my answer 🙂
Let me start from the top, one of the goal of the trains-agent is to reproduce the "original" execution environment. Once that is done, it will launch the code and monitor it. In order to reproduce the original execution environment, trains-agent will install all the needed python packages, pull the code, and apply the uncommitted changes.
If your entire environment is python based, then virtual-environment mode is proba...