Ok, I think I figured it out.
Nice!
ClearML doesn't add all the imported packages needed to run the task to the Installed Packages
It does, but not the derivative packages (the ones used by the required packages). Those are added when the agent runs the Task: it creates a new clean venv, installs the required packages, and then updates the list with everything in pip freeze, because that now represents all the packages the Task needs.
Two questions:
Is t...
Just making sure, after the pipe object is created, you can call Task.current_task(), is that correct?
Hi VastShells9
2022-12-20 12:48:02,560 - clearml.automation.optimization - WARNING - Could not find requested hyper-parameters ['duration'] on base task a6262a151f3b454cba9e22a77f4861e3
Basically it is telling you that it is setting a parameter it never found on the original Task you want to run the HPO on.
The parameter name should be (based on the screenshot) "Args/duration" (you have to add the section name to the HPO params). Make sense ?
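A minimal sketch of the naming issue, in plain Python so it runs without a server. The `base_params` dict is a hypothetical stand-in for the flat "Section/name" parameters of the base Task, and `qualify` / `missing_on_base` are illustration helpers, not ClearML functions.

```python
def qualify(name, section="Args"):
    """Prefix a bare parameter name with its section, e.g. 'duration' -> 'Args/duration'."""
    return name if "/" in name else f"{section}/{name}"


def missing_on_base(requested, base_params):
    """Return the requested names that do not appear on the base task."""
    return [name for name in requested if name not in base_params]


# Hypothetical parameters of the base task, keyed as "Section/name".
base_params = {"Args/duration": "10", "Args/lr": "0.01"}

print(missing_on_base(["duration"], base_params))           # ['duration']
print(missing_on_base([qualify("duration")], base_params))  # []
```

The qualified string is what you would then give as the parameter name in the HPO parameter ranges.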
See here:
https://download.pytorch.org/whl/torch_stable.html
cu110/* has no torch 1.3.1 only 1.7.0
To automate the process, we could use a pipeline, but first we need to understand the manual workflow
Hmm, conda_freeze in the clearml.conf on the development machine ?
Creating a dataset sounds like a good idea, but that does not seem to be the issue.
Can you verify you can manually clone using the same link (notice the log should specify the exact clone it is using, with the password replaced with *)
I double-checked with the guys: this issue was fixed in 1.14 (of clearml server). It should be released tomorrow / over the weekend.
Hi HandsomeCrow5 .
Remember the debug images are events with links to the actual images, so you first have to get the events and then you can download the images with https://allegro.ai/docs/examples/examples_storagehelper/#storagemanager (which by definition has the credentials, because it was able to upload them 🙂)
To get the events:
from trains.backend_api.session.client import APIClient
client = APIClient()
client.events.debug_images(task='aabbcc')
Ohh, then yes, you can use the https://github.com/allegroai/clearml/blob/bd110aed5e902efbc03fd4f0e576e40c860e0fb2/clearml/automation/monitor.py#L10 class to monitor changes in the dataset/project
sdk.conf will add it to the default loaded values (as I think you deduced).
Can you copy-paste the sdk.conf here? (maybe something is missing there?)
It's a good abstraction for monitoring the state of the platform and for callbacks, if this is what you are after.
If you just need "simple" cron, then you can always just loop/sleep 🙂
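A rough sketch of the loop/sleep option, assuming you have some `check` function that returns the current state (for example a list of dataset or task IDs). `poll`, `check` and `on_change` are hypothetical names for illustration, nothing here comes from the ClearML API.

```python
import time


def poll(check, on_change, interval_sec=60.0, max_iterations=None):
    """Call check() every interval_sec and fire on_change(old, new) when the value changes."""
    last = check()
    done = 0
    # max_iterations=None means "run forever", like a simple cron.
    while max_iterations is None or done < max_iterations:
        time.sleep(interval_sec)
        current = check()
        if current != last:
            on_change(last, current)
            last = current
        done += 1


# Usage with a fake state source, so the example finishes immediately.
states = iter(["v1", "v1", "v2"])
poll(lambda: next(states), lambda old, new: print(f"changed: {old} -> {new}"),
     interval_sec=0, max_iterations=2)
```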
I've seen that the file location of a task is saved
What do you mean by that? is it the execution section "entry point" ?
If you are using the latest RC:
pip install clearml==0.17.5rc5
You can pass True, and it will use the "files_server" as configured in your clearml.conf
I used the http link as a filler to point to the files_server.
Make sense ?
OutrageousGrasshopper93 could you send an example of the two links from the artifacts (one local one remote) ?
EnviousStarfish54 you can use Task.set_credentials
Notice that OS environment or trains.conf will override the programmatic credentials
https://allegro.ai/docs/task.html#trains.task.Task.set_credentials
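Since the environment wins over the programmatic credentials, here is a small sketch that checks for it before calling Task.set_credentials. The helper names are made up for illustration, and the list of TRAINS_* variables is my assumption of the relevant ones, so adjust it to your setup. The trains import is inside the function, so the check itself runs without trains installed.

```python
import os

# Assumed set of environment variables that would take precedence.
OVERRIDING_ENV_VARS = (
    "TRAINS_API_HOST",
    "TRAINS_WEB_HOST",
    "TRAINS_FILES_HOST",
    "TRAINS_API_ACCESS_KEY",
    "TRAINS_API_SECRET_KEY",
)


def overriding_env_vars(environ=None):
    """Return the names of the override variables that are currently set."""
    environ = os.environ if environ is None else environ
    return [name for name in OVERRIDING_ENV_VARS if environ.get(name)]


def set_credentials_checked(**credentials):
    """Warn about overriding env vars, then pass the credentials on to trains."""
    active = overriding_env_vars()
    if active:
        print("Warning: these env vars will override the credentials:", active)
    from trains import Task  # imported lazily, requires the trains package
    Task.set_credentials(**credentials)
```

This does not check trains.conf, which will also override the programmatic values.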
Hi ZippyAlligator65
You mean like env vars?
upload_artifact will actually do two things:
1. upload the file to the trains-server
2. register it as an artifact on the experiment
What did you mean by "register the artifact manually"? You still need to upload the file to the trains-server (so it is later accessible )
on the host machine or inside the containers that are spinning on the host machine ?
Hi FunnyTurkey96
Any chance you can try to run with the latest from GitHub? (I just tested your code and it seemed to work on my machine.)
pip install git+
I call Task.init after I import tensorflow (and thus tensorboard?)
That should have worked...
Can you manually add a TB report before calling the opennmt function ?
(I want to verify the Task.init is indeed catching the TB calls, my theory is that somewhere inside opennmt we lose the TB)
It should be the last line (or almost) of the log. Is it there ? Also, it seems from the log that you are using trains 0.14.3. Try with trains 0.15, and let me know if you are still missing packages
We actually added a specific call to stop the local execution and continue remotely , see it here: https://github.com/allegroai/trains/blob/master/trains/task.py#L2409
But what I get with get_local_copy() is the following path: ...
Getting the local copy returns an immutable copy of the dataset. By definition this will not be the "source" storing the data.
(Also notice that the dataset itself is stored in zip files, and when you get the "local-copy" you get the extracted files)
Make sense ?
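If you need to change the files, one option is to leave the cached copy alone and work on a duplicate. A minimal sketch with the standard library only; `make_working_copy` and the paths are hypothetical, and in real use `cached_dir` would be whatever get_local_copy() returned.

```python
import shutil
from pathlib import Path


def make_working_copy(cached_dir, target_dir):
    """Duplicate the extracted dataset folder so edits never touch the cached copy."""
    target = Path(target_dir)
    # copytree raises if target already exists, which avoids silently
    # mixing an old working copy with a new dataset version.
    shutil.copytree(cached_dir, target)
    return target
```

The Dataset class also has get_mutable_local_copy(), which hands you a writable copy in a target folder directly, so check that first if you are on a recent clearml version.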