Can the ClearML file server be configured to use any kind of storage? For example HDFS, or even a database, etc.
DeliciousBluewhale87 long story short, no :) The file server just stores/retrieves/deletes files from a local/mounted folder.
Are there any ways we can scale this file server when our data volume explodes? Maybe it wouldn't be an issue in the K8s environment anyway. Or can it be configured so that all data is stored in HDFS (which helps with scalability)? I would su...
TrickyRaccoon92
I guess elegant is the challenge :)
What exactly is the use case ?
Wait I might be completely off.
Is this the line that "hangs"?
task.execute_remotely(..., exit_process=True)
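For context, here is a minimal sketch of how that call is typically used. The project name, task name, and queue name are placeholders (assumptions, not taken from this thread). With exit_process=True the local process is expected to exit right after the task is enqueued, which can look like a hang or an abrupt stop when run from an interactive session.

```python
def run_remotely(queue_name="default"):
    """Enqueue the current script for an agent and stop the local run.

    The queue name, project name and task name are placeholders.
    """
    # Imported inside the function so this sketch can be loaded
    # even on a machine where clearml is not installed.
    from clearml import Task

    task = Task.init(project_name="examples", task_name="execute_remotely sketch")

    # Locally: the task is enqueued and, because exit_process=True,
    # the local process exits here.
    # On the agent: the call is skipped and execution continues below.
    task.execute_remotely(queue_name=queue_name, exit_process=True)

    print("This line is only expected to run on the agent")
```

Calling run_remotely("default") from a script should enqueue the task and end the local process, so anything after the call runs only on the remote machine.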
I would just add a git+ link to the repository to your requirements (either in the requirements.txt or, even better, as part of the pipeline/component where you also specify the repo to be used).
The agent will automatically push the credentials when it installs the repo as a wheel.
wdyt?
btw: you might also get away with adding -e .
into the requirements.txt (but you will need to test that one)
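As a rough illustration of the two options above, the sketch below writes a requirements.txt containing a git+ entry and, optionally, the -e . line. The repository URL and the helper name write_requirements are made up for this example; substitute your own repo, and note that the -e . variant is untested here, as mentioned above.

```python
import tempfile
from pathlib import Path

# Hypothetical repository URL, used only for illustration.
EXAMPLE_REPO_URL = "https://github.com/example-org/example-repo.git"


def write_requirements(path, repo_url=None, editable_local=False):
    """Write a requirements.txt with a git+ entry and/or an '-e .' entry."""
    lines = []
    if repo_url:
        # pip installs directly from a git repository given a git+ URL.
        lines.append(f"git+{repo_url}")
    if editable_local:
        # Installs the current directory in editable mode.
        lines.append("-e .")
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


# Small demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "requirements.txt"
    written = write_requirements(target, repo_url=EXAMPLE_REPO_URL, editable_local=True)
    content = target.read_text()

print(content)
```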
JitteryCoyote63 the fix should be pushed later today :)
Meanwhile you can manually add the Task.init() call at the top of the original script, it is basically the same :)
JitteryCoyote63
Should be added before the if __name__ == "__main__": ?
Yes, it should.
From your code I understand it is not?
What's the clearml version you are using?
WackyRabbit7 How do I reproduce it ?
I think a task.init flag would be great! :)
I'm really for adding an interface, but I was not able to locate a simple integration option with basically anything, Wdyt ?
Makes sense
We need to figure out what would be the easiest way to have an "opt-in" for the demo server that will still make it a breeze to quickly test code integration ...
Any suggestions are welcome :)
SubstantialElk6 in the "Execution" tab, scroll down and you should have an "Installed Packages" section. What do you have there?
DefeatedOstrich93 can you verify that lightning actually stored it only once?
Hi JitteryCoyote63
RC is out,
pip3 install clearml-agent==1.5.3rc3
Then in the agent configuration set pytorch_resolve: "direct"
Let me know if it worked
The notebook path goes through a symlink a few levels up the file system (before hitting the repo root, though)
Hmm sounds interesting, how can I reproduce it?
The notebook kernel is also not the default kernel,
What do you mean?
The Cloud Access section is in the Profile page.
Any storage credentials (S3 for example) are only stored on the client side (never on the trains-server), which is why we need to configure them in the trains.conf. When the browser needs to access those URLs (e.g. downloading an artifact) it also needs the secret/key, so it automatically displays a popup requesting them and stores them in this section. Notice they are stored in the browser session (as a cookie).
Martin, if you want, feel free to add your answer in the stackoverflow so that I can mark it as a solution.
Will do :) give me 5
OutrageousGrasshopper93 could you send an example of the two links from the artifacts (one local one remote) ?
Thanks OutrageousGrasshopper93
I will test it "!".
By the way the "!" is in the project or the Task name?
Hey IntriguedRat44 ,
Is this what you are after?
https://github.com/allegroai/trains/issues/181
I see.
You can get the offline folder programmatically, then copy the folder content (it's the same as the zip, and you can also pass a folder instead of a zip to the import function):
task.get_offline_mode_folder()
You can also have a soft link of the offline folder (if you are working on a Linux machine):
ln -s myoffline_folder ~/.trains/cache/offline
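The two options above (copying the folder content, or soft-linking it) could be scripted roughly as follows. This is only a sketch: the helper names are made up, and the source folder here is a temporary stand-in for whatever task.get_offline_mode_folder() returns on your machine.

```python
import os
import shutil
import tempfile
from pathlib import Path


def copy_offline_folder(offline_folder, destination):
    """Copy the offline session folder content to another location."""
    destination = Path(destination)
    # dirs_exist_ok requires Python 3.8+.
    shutil.copytree(offline_folder, destination, dirs_exist_ok=True)
    return destination


def link_offline_folder(offline_folder, link_path):
    """Create a soft link to the offline folder (the 'ln -s' equivalent)."""
    link_path = Path(link_path)
    link_path.parent.mkdir(parents=True, exist_ok=True)
    if link_path.is_symlink():
        link_path.unlink()  # replace a stale link
    os.symlink(Path(offline_folder).resolve(), link_path, target_is_directory=True)
    return link_path


# Demo with a stand-in folder; in practice the source would be the path
# returned by task.get_offline_mode_folder().
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "myoffline_folder"
    source.mkdir()
    (source / "task.json").write_text("{}")

    copied = copy_offline_folder(source, Path(tmp) / "backup")
    linked = link_offline_folder(source, Path(tmp) / "cache" / "offline")

    copy_ok = (copied / "task.json").exists()
    link_ok = linked.is_symlink() and (linked / "task.json").exists()

print(copy_ok, link_ok)
```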
CooperativeFox72 btw, are you guys running those 20 experiments manually or through trains-agent ?
It manages the scheduling process, so there is no need to package your code or worry about building dockers etc. It also has an AWS autoscaler that spins up EC2 instances based on the number of jobs you have in the execution queue and the limit of your budget (and obviously spins down machines that are idle).
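To make the autoscaling idea concrete, here is a toy sketch of the decision described above: start instances for queued jobs up to a budget cap, and stop instances that have been idle too long. It is not the actual autoscaler code; the function name, parameters, and the 10-minute default are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ScalingPlan:
    start: int                                 # number of new instances to launch
    stop: list = field(default_factory=list)   # ids of idle instances to terminate


def plan_scaling(queued_jobs, idle_minutes, busy, max_instances, max_idle_minutes=10):
    """Decide how many instances to start and which idle ones to stop.

    idle_minutes maps instance id -> minutes the instance has been idle.
    busy is the number of instances currently running a job.
    max_instances stands in for the budget limit.
    """
    # Idle instances can pick up queued jobs before anything new is launched.
    absorbed = min(queued_jobs, len(idle_minutes))
    remaining = queued_jobs - absorbed
    total = busy + len(idle_minutes)
    start = max(0, min(remaining, max_instances - total))

    # Only spin machines down when nothing is waiting in the queue.
    stop = []
    if queued_jobs == 0:
        stop = sorted(i for i, m in idle_minutes.items() if m >= max_idle_minutes)
    return ScalingPlan(start=start, stop=stop)


busy_plan = plan_scaling(queued_jobs=5, idle_minutes={}, busy=1, max_instances=3)
quiet_plan = plan_scaling(queued_jobs=0, idle_minutes={"a": 15, "b": 2}, busy=0, max_instances=3)
print(busy_plan, quiet_plan)
```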
Maybe permissions?!
You can test it manually by installing pynvml and running:
from pynvml.smi import nvidia_smi
nvsmi = nvidia_smi.getInstance()
nvsmi.DeviceQuery('memory.free, memory.total')
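A slightly more defensive variant of the same check is sketched below. It uses the lower-level pynvml calls (nvmlInit, nvmlDeviceGetMemoryInfo) rather than the smi wrapper from the message above, and it reports an error instead of raising, which may help tell a missing package apart from a driver or permission problem. The function name gpu_memory is made up for this example.

```python
def gpu_memory():
    """Return free/total memory (in bytes) per GPU, or [] if NVML is unusable."""
    try:
        import pynvml
    except ImportError:
        print("pynvml is not installed (pip install pynvml)")
        return []

    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError as error:
        # Typically a missing driver/library, or insufficient permissions.
        print(f"NVML could not be initialised: {error}")
        return []

    try:
        result = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            info = pynvml.nvmlDeviceGetMemoryInfo(handle)
            result.append({"index": index, "free": info.free, "total": info.total})
        return result
    except pynvml.NVMLError as error:
        print(f"NVML query failed: {error}")
        return []
    finally:
        pynvml.nvmlShutdown()


memory_report = gpu_memory()
print(memory_report)
```

An empty list together with a printed error suggests the problem is in the driver or permissions rather than in Trains itself.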
BoredGoat1
Hmm, that means it should have worked with Trains as well.
Could you run the attached script, see if it works?
Hi ProudMosquito87
So you mean to mount your data folder onto the docker so that the code can access it, correct?
If that is the case, is there a specific reason not to use an absolute path (e.g. /mnt/data/mine)?
WackyRabbit7 I guess we are discussing this one on a diff thread :) but yes, it should totally work, that's the idea
ShallowGoldfish8 this call does that:
https://github.com/allegroai/clearml/blob/0397f2b41e41325db2a191070e01b218251bc8b2/examples/advanced/execute_remotely_example.py#L127