It will store the entire content of the file; you can then edit it in the UI, and when running remotely it will return a new local copy of the file (based on the data in the UI) for you to read.
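For context, a minimal sketch of how this looks in code, assuming the connect_configuration API (the project, task, and file names are placeholders):

from trains import Task

task = Task.init(project_name='examples', task_name='config demo')  # placeholder names
# The file content is stored on the task; when executed remotely, the
# returned path points to a new local copy built from the UI content.
config_path = task.connect_configuration('my_config.yaml')  # hypothetical config file
with open(config_path) as f:
    config_text = f.read()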
Hi TrickyRaccoon92
BTW: check out the HP optimization example, it might make things even easier 🙂 https://github.com/allegroai/trains/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py
Did you mean
--detached
?
Oops, yes, sorry, you are correct, it should be --detached 🙂
EnviousStarfish54 regarding the file server: you have one built into the trains-server, and it is the default location for storing all artifacts. You can also use external solutions like S3, GS, Azure, etc.
Regarding the models, any model store/load is automatically logged as long as you are using one of the supported frameworks (TF, Keras, PyTorch, scikit-learn)
If you want your model to be automatically uploaded, just add output_uri:
task=Task.init('examples', 'model', output_uri='http://trai...
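For example, a complete call might look like this (the fileserver address below is an assumption; replace it with your own trains-server host, or an external store such as s3://bucket/folder):

task = Task.init('examples', 'model', output_uri='http://localhost:8081')  # assumed default fileserver port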
Ohh I see, makes total sense. I'm assuming the code base itself can do both 🙂
I couldn't change the task status from draft to complete
Task.completed(ignore_errors=True)
What's the error/reply?
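For reference, a minimal sketch of that call, assuming you fetch the task by id first (the id is a placeholder):

from trains import Task

task = Task.get_task(task_id='<task-id>')  # placeholder id
task.completed(ignore_errors=True)  # marks the task as 'completed'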
Hmm, maybe we should add a test once the download is done, comparing the expected file size with the actual file size, and if they differ, re-download?
(currently I think the implementation expects that if the download completed, it was successful)
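Something along these lines, as a sketch (the helper name and retry count are made up for illustration):

import os
import requests

def download_with_size_check(url, dest, max_retries=3):
    # Compare the server-reported Content-Length with the size on disk;
    # re-download on mismatch instead of trusting a completed transfer.
    for _ in range(max_retries):
        resp = requests.get(url, stream=True)
        expected = int(resp.headers.get('Content-Length', -1))
        with open(dest, 'wb') as f:
            for chunk in resp.iter_content(chunk_size=1 << 20):
                f.write(chunk)
        if expected < 0 or os.path.getsize(dest) == expected:
            return dest
    raise RuntimeError('download size mismatch after %d retries' % max_retries)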
BTW:
Error response from daemon: cannot set both Count and DeviceIDs on device request.
Googling it points to a docker issue (which makes sense, considering the error comes from the docker daemon):
https://github.com/NVIDIA/nvidia-docker/issues/1026
What is the host OS?
Okay, I'll make sure we always quote ", since it seems to work either way.
We will release an RC soon, with this fix.
Sounds good?
Hi @<1724235687256920064:profile|LonelyFly9>
So, I noticed that with the REST API at least the
/tasks.get_all
endpoint appears to have an undocumented maximum page size of 500.
Yeah, otherwise the request size might be too big, but you have pagination:
page
optional Page number, returns a specific page out of the resulting list of tasks
Minimum value: 0 (integer)
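A hedged sketch of paging through the endpoint (host, port, and credentials are placeholders; the exact auth scheme depends on your server setup):

import requests

host = 'http://localhost:8008'  # assumed default api-server address
page, page_size = 0, 500
while True:
    resp = requests.post(
        '%s/tasks.get_all' % host,
        json={'page': page, 'page_size': page_size},
        auth=('<access_key>', '<secret_key>'),  # placeholder credentials
    ).json()
    tasks = resp['data']['tasks']
    if not tasks:
        break
    # ... process this page of tasks ...
    page += 1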
Is the base Task a file or a notebook?
@<1523701066867150848:profile|JitteryCoyote63>
I just created a new venv and run
pip install "torch==1.11.0.*" --extra-index-url
Then started python:
import torch
torch.cuda.is_available()
And I get True
what are you getting?
Is there any documentation on versioning for Datasets?
You mean how to select the version name ?
SmarmyDolphin68, all looks okay to me...
Could you verify you still get the plot on debug samples as an image with the latest trains RC: pip install trains==0.16.4rc0
Quite hard for me to try this right now 🙂
How do I reproduce it?
"General" is the parameter section name (like Args)
Hi LackadaisicalOtter14
Is it possible to remove this line to stop it from being executed
Everything is possible 🙂 I think the main question is why it is there (which, to the best of my understanding, is to solve for any cuda drivers and installed packages, meaning anything that is installed at runtime)
I think we can suppress the error, wdyt?
'echo "ldconfig" 2>/dev/null >> /etc/profile && '
JitteryCoyote63 the new wizard was pushed, you can check it out here:
https://github.com/allegroai/trains/blob/master/examples/services/aws-autoscaler/aws_autoscaler.py
BTW: the next release, which will include it all, is next week (hopefully :))
Hi CharmingBeetle38
On the base task, do you see those arguments under the Configuration tab?
Also, if they are under Args section, you should add "Args/" prefix to the HP optimization (this is how you differentiate between the sections)
CharmingBeetle38 try adding "General/" before the arguments. This means batch_size becomes General/batch_size. This is only because we are accessing the parameters externally, when the task is executed it is resolved automatically
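To illustrate the prefix, a minimal sketch based on the HP optimization example linked above (the base task id and metric names are placeholders):

from trains.automation import DiscreteParameterRange, HyperParameterOptimizer

optimizer = HyperParameterOptimizer(
    base_task_id='<base-task-id>',  # placeholder
    hyper_parameters=[
        # 'General/' (or 'Args/') is the section prefix on the parameter name
        DiscreteParameterRange('General/batch_size', values=[16, 32, 64]),
    ],
    objective_metric_title='accuracy',     # placeholder metric
    objective_metric_series='validation',  # placeholder series
    objective_metric_sign='max',
)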
YummyMoth34
It tried to upload all events and then killed the experiment
Could you send a log?
Also, what's the trains package version?
No, it is zipped and stored, so in order to open the zipfile and read the files you have to download them.
That said, everything is cached, so if the machine already downloaded the dataset there is zero download / unzipping.
Makes sense?
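As a sketch of that caching behavior (project and dataset names are placeholders; assumes the clearml-era Dataset API):

from clearml import Dataset

ds = Dataset.get(dataset_project='examples', dataset_name='my_dataset')  # placeholders
# Downloads and unzips once; subsequent calls on the same machine
# return the cached local folder with zero download / unzipping.
local_folder = ds.get_local_copy()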
Can you see the repo itself? The commit id?
Ohh I see.
In your web app, look for the "?" icon (bottom left corner) and click on it; it should open the full platform documentation