PS. I just noticed that this function is not documented. I'll make sure it appears in the doc-string.
@<1570220844972511232:profile|ObnoxiousBluewhale25> it creates a new Model here
If you want it to log to something other than the default file server create the clearml Task before starting the training:
task = Task.init(..., output_uri="file:///home/karol/data/")
# now training
It will use the existing Task and upload to the destination folder
I did nothing to generate a command-line. Just cloned the experiment and enqueued it. Used the server GUI.
Who/What created the initial experiment ?
I noticed that if I run the initial experiment by "python -m folder_name.script_name"
"-m module" as script entry is used to launch entry points like python modules (which is translated to "python -m script")
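For context, here is a small sketch of why the two launch styles are not interchangeable. With "python -m folder_name.script_name" the file runs as part of its package (so `__package__` is set and relative imports work), while "python folder_name/script_name.py" runs it as a standalone script. The throwaway package created below and the helper name `run_both_ways` are only for illustration, this is not what the agent itself does.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Create a throwaway package and launch the same file as a script and as a module."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "folder_name"  # hypothetical package, mirrors the name in the chat
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "script_name.py").write_text("print(repr(__package__))\n")

        # Launched by file path: the file is a plain script with no package context.
        as_script = subprocess.run(
            [sys.executable, str(pkg / "script_name.py")],
            cwd=tmp, capture_output=True, text=True, check=True,
        ).stdout.strip()

        # Launched with -m: the file is imported as a member of its package.
        as_module = subprocess.run(
            [sys.executable, "-m", "folder_name.script_name"],
            cwd=tmp, capture_output=True, text=True, check=True,
        ).stdout.strip()
    return as_script, as_module
```

Calling `run_both_ways()` returns what each launch printed for `__package__`, which is the piece of state that relative imports depend on.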
Why isn't the entry point just the python script?
The command line arguments are passed as arguments on the Args section of t...
Hi ClumsyElephant70
What's the clearml you are using ?
(The first error is a by-product of python process.Event created before a forkserver is created, some internal python issue. I thought it was solved, let me take a look at the code you attached)
The additional edges in the graph suggest that these steps somehow contain dependencies that I do not wish them to have.
PanickyMoth78 I think I understand what you are saying, but it is hard to see if there is a "bug" here or a feature...
Can you post the full code of the pipeline?
I think task.init flag would be great!
👍
Hi @<1689808977149300736:profile|CharmingKoala14> , let me double check that
I suppose the same would need to be done for any client PC running clearml such that you are submitting dataset upload jobs?
Correct
That is, the dataset is perhaps local to my laptop, or on a development VM that is not in the clearml system, but from there I want to submit a copy of a dataset, then I would need to configure the storage section in the same way as well?
Correct
Hi ImpressionableRaven99
In the UI there is a download button when you hover over the graph.
Are you asking if there is a programmatic interface?
What is the use case for all experiments ?
I have installed a python environment with the virtualenv tool, let's say /home/frank/env, and python is /home/frank/env/bin/python3. How can I reuse the virtualenv when setting up the clearml agent?
So the agent is already caching the entire venv for you, nothing to worry about, just make sure you have this line in clearml.conf:
https://github.com/allegroai/clearml-agent/blob/249b51a31bee97d63f41c6d5542e657962008b68/docs/clearml.conf#L131
No need to provide it an existing...
GiddyTurkey39 what do you have in the Task itself (i.e. git repo, uncommitted changes, installed packages)?
sdk.storage.cache.size.cleanup_margin_percent
Hi ReassuredTiger98
This is actually future proofing the cache mechanism and allowing it to be "smarter", i.e. to clean based on cache folder size instead of cache folder entries. This is currently not available
sdk.storage.cache parameters for the agent?
For both local execution and with an agent
When are datasets deleted if I run local execution?
When you hit the cache entry limit (100 if I recall). This can a...
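To illustrate what cleaning by number of cache entries (as opposed to by folder size) can look like, here is a minimal sketch. It is not ClearML's actual cache code: `trim_cache`, `demo`, the file names and the limit of 3 are all made up for the example, and the eviction order (oldest modification time first) is an assumption.

```python
import os
import tempfile
from pathlib import Path


def trim_cache(folder, max_entries):
    """Delete the least recently modified files until at most max_entries remain."""
    files = sorted(Path(folder).iterdir(), key=lambda p: p.stat().st_mtime)
    excess = files[: max(0, len(files) - max_entries)]
    for path in excess:
        path.unlink()
    return [p.name for p in excess]


def demo():
    """Fill a temporary folder with 5 entries, then trim it down to 3."""
    with tempfile.TemporaryDirectory() as folder:
        for i in range(5):
            path = Path(folder) / f"entry_{i}"
            path.write_text("data")
            # Force a known age ordering: entry_0 is the oldest.
            os.utime(path, (1_000_000 + i, 1_000_000 + i))
        removed = trim_cache(folder, max_entries=3)
        remaining = sorted(p.name for p in Path(folder).iterdir())
    return removed, remaining
```

A size-based policy would sum `p.stat().st_size` over the folder and delete until the total drops below a threshold instead of counting entries.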
Hi @<1557899668485050368:profile|FantasticSquid9>
There is some backwards compatibility issue with 1.2 (I think).
Basically what you need is to spin up a new one with a new session ID and re-register the endpoints
This will mount the trains-agent machine's hosts file into the docker
Hi ContemplativePuppy11
This is a really interesting point.
Maybe you can provide a pseudo class abstract of your current pipeline design, this will help in trying to understand what you are trying to achieve and how to make it easier to get there
the separate experiments are not starting back at iteration 0
What do you mean by that?
Hi RoughTiger69
A. Yes, makes total sense. Basically you can use Task.export_task / Task.import_task to achieve this (notice we assume the dataset artifact links are available on both, usually this is the case)
B. The easiest way would be to use Process, where one subprocess exports from dev, with the credentials and configuration passed via OS environment variables. Another subprocess then imports it to the prod server (again with the OS environment pointing to the prod server). Make sense?
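A rough sketch of the "one subprocess per server" idea in B, assuming each child process picks up its server settings from environment variables such as CLEARML_API_HOST (credentials would be passed the same way). The URLs are placeholders, and the child here only prints the host it sees; a real child would run the export or import calls instead.

```python
import os
import subprocess
import sys

# Hypothetical server URLs; replace with your own dev/prod API endpoints.
DEV_ENV = {"CLEARML_API_HOST": "https://dev.example.com"}
PROD_ENV = {"CLEARML_API_HOST": "https://prod.example.com"}

# Placeholder child program: a real one would do the ClearML export/import here.
CHILD_CODE = "import os; print(os.environ['CLEARML_API_HOST'])"


def run_with_env(extra_env):
    """Run the child in its own process so it only sees the given server settings."""
    env = {**os.environ, **extra_env}
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Running `run_with_env(DEV_ENV)` and then `run_with_env(PROD_ENV)` keeps the two configurations isolated, since neither child inherits the other's server settings.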
Legit, if you have a cached_file (i.e. exists and accessible), you can return it to the caller
Hi SourSwallow36
What do you mean by Log each experiment separately ? How would you differentiate between them?
Ok, I think I figured it out.
Nice!
ClearML doesn't add all the imported packages needed to run the task to the Installed Packages
It does, but not derivative packages (the ones used by the required packages). The derivative packages are added when the agent runs it: the agent creates a new clean venv, adds the required packages, and then updates the list with everything in pip freeze, because that now represents all the packages the Task needs.
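As a rough illustration of the "directly imported" half of that distinction, the sketch below collects the top-level packages a script imports by parsing its source. This is only an approximation of the idea, not ClearML's actual analysis; `top_level_imports` and the sample source are made up for the example. The derivative packages are exactly the ones this kind of scan cannot see, which is why they only show up after the agent's pip freeze.

```python
import ast


def top_level_imports(source):
    """Return the sorted top-level module names imported directly by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which are local code, not packages.
            names.add(node.module.split(".")[0])
    return sorted(names)


# Sample script text; the imports are only parsed, never executed.
example = "import numpy as np\nfrom torch.utils import data\nfrom . import helpers\n"
direct = top_level_imports(example)
```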
Two questions:
Is t...
That somehow the PV never worked and it was all local inside the pod
is this a config file on your side or something I can change, if we had enterprise version?
Yes, this is one of the things you can configure
I see it's a plotly plot, even though I report a matplotlib one
ClearML tries to convert matplotlib into plotly objects so they are interactive, if it fails it falls back to a static image, as in matplotlib
ok, yes, but this will install the package of the branch specified there.
Correct
So if I'm working on my own branch and want to run an experiment, I would have to manually put my current branch name in the git path.
When you say your own branch you mean local (i.e. not pushed to remote git repo) ?
Hi FriendlyKoala70 you can edit the installed package section and add the missing package. See more details on how trains-agent works here (although it's on conda the same rules apply for pip) https://github.com/allegroai/trains-agent/issues/8
Basically it hooks into any torch.save function (monkey patching in realtime)
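For anyone unfamiliar with the term, here is a minimal sketch of what monkey patching a save function at runtime means. It uses a stand-in object instead of torch, and `fake_torch`, `patch_save` and `saved_paths` are hypothetical names for the example; it is not ClearML's actual hook.

```python
import functools
import types

# Stand-in for a framework module; `fake_torch.save` is purely illustrative.
fake_torch = types.SimpleNamespace(save=lambda obj, path: f"saved to {path}")

saved_paths = []  # the hook records every checkpoint path it sees here


def patch_save(module):
    """Replace module.save with a wrapper that records the path, then calls the original."""
    original = module.save

    @functools.wraps(original)
    def wrapper(obj, path, *args, **kwargs):
        saved_paths.append(path)  # a real hook could register/upload the model here
        return original(obj, path, *args, **kwargs)

    module.save = wrapper
    return original  # returned so the patch can be undone if needed


patch_save(fake_torch)
result = fake_torch.save({"weights": [1, 2]}, "model.pt")
```

The calling code is unchanged: it still calls `save(...)` and gets the original return value, while the wrapper sees every call on the way through.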
Very Cool!
BTW guys, are you using the task.models[] to continue from the last checkpoint? or is it task.artifacts[] ?
Hey SarcasticSparrow10 see here 🙂
https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_linux_mac.html#upgrading
I did change the
instead of 8080?
So this is the issue