After adding the `import fastparquet` statement to the code, the reconstruction of a clone is working.
` Summary - installed python packages:
...
- fastparquet==0.4.1
...
Environment setup completed successfully
Starting Task Execution:
...
modeller.py: error: the following arguments are required: --algorithm `
Unfortunately, it raises the next issue.
If the script being used expects to get parameters via the command line (which in Trains experiments are identified and stored as parameters when using...
Regarding the clean-up service, do I need to run this as a cron job, or does the Trains server support some kind of add-ons where I need to copy the script to?
the log of the fileserver pod seems quite empty
` root@vmd62521:~# kubectl logs fileserver-6f49b74556-2m4n2 -n trains --all-containers
- Serving Flask app "fileserver" (lazy loading)
- Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
- Debug mode: off
root@vmd62521:~# `
The same goes for the agentservices:
root@vmd62521:~# kubectl logs agentservices-56655788b6-rnbk4 apiserver-7d9cd59844-dfd5s -n train...
the one I sent you the snippet of the api {} config from?
Sorry, but I don't understand how the cloned experiment is being provided with parameters.
A task which has been cloned by Trains might get its parameters via `task.set_parameters(dict)`.
These parameters come from some magic analysis of the argparse being used in the script.
AgitatedDove14 when is the call to set_parameter(...) performed? Is the argparse call somehow redirected, so that it receives the data from Trains instead of getting them via sys.argv or wherever argparse is gettin...
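To make the question concrete, here is a rough standard-library sketch of the general idea: stored parameters are turned back into an argument list and handed to `parse_args` in place of `sys.argv`. This is not a description of what Trains does internally (that is exactly what is being asked). `params_to_argv`, `stored_params`, the `--epochs` option and the value `xgboost` are all made up for illustration; only `--algorithm` comes from the error message above.

```python
import argparse


def params_to_argv(params):
    """Turn a flat {name: value} dict into a command-line style list."""
    argv = []
    for name, value in params.items():
        argv.extend([f"--{name}", str(value)])
    return argv


def build_parser():
    """Parser resembling a script that requires --algorithm."""
    parser = argparse.ArgumentParser(prog="modeller.py")
    parser.add_argument("--algorithm", required=True)
    parser.add_argument("--epochs", type=int, default=10)  # hypothetical option
    return parser


# Hypothetical parameters, as they might be stored with a cloned experiment.
stored_params = {"algorithm": "xgboost", "epochs": 5}

# Passing an explicit list means argparse does not read sys.argv at all.
args = build_parser().parse_args(params_to_argv(stored_params))
print(args.algorithm, args.epochs)
```

With an explicit argument list the required `--algorithm` is satisfied even though nothing was typed on the command line, which is the behaviour a cloned run would need.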
` # TRAINS SDK configuration file
api {
    # Notice: 'host' is the api server (default port 8008), not the web server.
    api_server:
    web_server:
    files_server:
    # Credentials are generated using the webapp, /profile
    # Override with os environment: TRAINS_API_ACCESS_KEY / TRAINS_API_SECRET_KEY
    credentials {....}
}
sdk {
    # TRAINS - default SDK configuration
`
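The "Override with os environment" comment in the config describes a precedence rule that can be sketched in a few lines of Python. This is only an illustration of that rule, not the SDK's own code: `resolve_credentials` is a hypothetical helper, the `access_key` / `secret_key` key names are an assumption about the elided `credentials {....}` block, and the values are placeholders.

```python
import os


def resolve_credentials(config, environ=os.environ):
    """Return (access_key, secret_key), preferring environment variables."""
    access_key = environ.get("TRAINS_API_ACCESS_KEY", config.get("access_key"))
    secret_key = environ.get("TRAINS_API_SECRET_KEY", config.get("secret_key"))
    return access_key, secret_key


# Placeholder values standing in for the elided credentials block.
file_credentials = {"access_key": "key-from-file", "secret_key": "secret-from-file"}

# No environment variables set: the values from the file are used.
print(resolve_credentials(file_credentials, environ={}))

# One variable set: it wins over the corresponding file value.
print(resolve_credentials(file_credentials, environ={"TRAINS_API_ACCESS_KEY": "key-from-env"}))
```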
Another question I have: are the trained models stored (I guess they are stored) in MongoDB or in the file system, and which format is used?
AgitatedDove14 I would like to publish 2 additional articles covering the use with Docker and Kubernetes. Docker I managed, but my Kubernetes knowledge is quite low. I managed to set up a K3S cluster, which might also be worth an article, but I still do not really understand how to add workers with agents to it...
It might take some time until I write the Kubernetes stuff. Once I'm done I will let you know.
with `if task.running_locally(): fig.show()` it works 🙂
thank you for the support
AgitatedDove14 calling `astype(str)` on the index did the magic 🙂 thanks
Sounds good :) I'm currently trying to run an orca instance ... but without success
the apiserver pod reports quite a lot
🙂 but I still need the load balancer ...
never mind, some day I will have it running 😉
Cool
I'm already impressed by what Trains does with just 2 lines of code
Thanks for the Twitter tweet.
The credentials are already deleted
but first I need to understand how parameters are processed. See my last question in my earlier thread: https://app.slack.com/client/TT9ATQXJ5/CTK20V944/thread/CTK20V944-1603740766.425000
AgitatedDove14 The problem I have with getting the ingress running seems to be caused by the fact that I'm running Rancher in single-node mode (using a Docker image ...), where port 80 is already in use, so the web service (WebUI) of Trains cannot be mapped to the same port ...
Nevertheless I will continue with a real Kubernetes cluster installation and try to get Trains + additional agents of my own running on it 😉
thanks so far for the support you provided. I will try to collect the i...
AgitatedDove14 unfortunately I still have issues with the plot. After removing the first row I get a weird empty remote plot where the axis is a counter instead of a date. This does not seem to be ClearML related, and I need to get more familiar with plotly to analyze it.
api_server and web_server look ok
` (py38) wgo@NVidia-power:~/dev/Trains/trains$ curl
{"meta":{"id":"bb5cd73435fb4127b9509ce3a771e95b","trx":"bb5cd73435fb4127b9509ce3a771e95b","endpoint":{"name":"","requested_version":1.0,"actual_version":null},"result_code":400,"result_spath /","error_stack":null},"data":{}}
(py38) wgo@NVidia-power:~/dev/Trains/trains$ curl `
<!doctype html>
<html lang="en">
<head> <meta charset="utf-8"> <title>trains</title> <base href="/"> <meta name="vie...
ok, thanks. This is enough information. You don't need to check how much space is provided to the accounts
AgitatedDove14 ok, and how much storage is an account allowed to use? Once that limit is reached, will the oldest experiments be deleted?

