Does the ClearML server act as a worker I can serve models on?
The serving is done by one of the clearml-agents.
Basically you spin an agent, then this agent is spinning the model serving engine container (fully managed).
(1) Install and run clearml-agent, (2) run the clearml-session CLI to configure and spin up the serving engine.
Could you disable the Windows anti-virus / firewall and test?
Hi ComfortableHorse5
Yes, this is more of a suggestion that you should write them using the platform capabilities. The UI implementation is being worked on, as well as a few helper classes; I think you'll be able to see a few in the next release 🙂
Seems like an okay clearml.conf file.
Notice this is the error: 404
Can you curl to this address? Are you sure you have HTTPS and not HTTP? Was the DNS configured?
No worries 🙂 glad to hear it worked out 🙂
Assuming this is a followup on:
https://clearml.slack.com/archives/CTK20V944/p1626184974199700?thread_ts=1625407069.458400&cid=CTK20V944
This depends on how you set it with clearml-serving --endpoint my_model_entry:
```
curl <serving-engine-ip>:8000/v2/models/my_model_entry/versions/1
```
Hi BattyLizard6
Not that I'm aware of, which TF version are you using, and which clearml version?
Hi EnviousStarfish54
Verified with the frontend / backend guys.
Backend allows searching for "all" tags, and the frontend will add a toggle button in the UI to select any/all for the selected tags.
Should be part of the next release
Nice debugging experience
Kudos on the work!
BTW, I feel weird adding an issue on their GitHub, but someone should; this generic setup will break all sorts of things ...
EnviousStarfish54 regarding the file server: you have one built into the trains-server, and it will be the default location to store all artifacts. You can also use external solutions like S3, GS, Azure, etc.
Regarding the models, any model store / load is automatically logged as long as you are using one of the supported frameworks (TF, Keras, PyTorch, scikit-learn).
If you want your model to be automatically uploaded, just add output_uri:
```
task = Task.init('examples', 'model', output_uri='http://trai...
```
Hi DisturbedWalrus17
This is a bit of a hack, but will work:
```
from clearml.backend_interface.metrics.events import UploadEvent
UploadEvent._file_history_size = 10
```
Maybe we should expose it somewhere, what do you think?
I see, by default it will look for requirements.txt in the root of the repo (the actual repo).
That said, in code you can specify the requirements.txt:
```
Task.force_requirements_env_freeze(requirements_file='repo/project-a/requirements.txt')
task = Task.init(...)
```
Notice, you need to call it prior to the Task.init call.
ElegantKangaroo44 I think TrainsCheckpoint would probably be the easiest solution. I mean it will not be a must, but another option to deepen the integration, and allow us more flexibility.
Hi @<1726410010763726848:profile|DistinctToad76>
Why not just report scalars? You can use the x-axis as "iterations" if this is running in real time to collect the prompts.
If this is a summary, then just report a scatter plot (you can also specify the names of the axes and the series).
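A minimal sketch of both options, assuming a Task is already running; the titles, series names, and per-prompt scores below are made up for illustration (report_scalar and report_scatter2d are the ClearML Logger methods in question):

```python
# Hypothetical per-prompt metric values, for illustration only
prompt_scores = [0.91, 0.84, 0.77, 0.95]

# report_scatter2d expects an Nx2 list/array of [x, y] pairs
scatter = [[i, s] for i, s in enumerate(prompt_scores)]

try:
    from clearml import Task
    task = Task.current_task()
    if task:
        logger = task.get_logger()
        # Option 1: scalars, using the prompt index as the "iteration"
        for i, s in enumerate(prompt_scores):
            logger.report_scalar(title="prompts", series="score",
                                 value=s, iteration=i)
        # Option 2: a summary scatter plot with named axes
        logger.report_scatter2d(
            title="prompt summary", series="scores",
            scatter=scatter, iteration=0,
            xaxis="prompt index", yaxis="score",
        )
except ImportError:
    pass  # clearml not installed; the data shaping above still runs
```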
WittyOwl57 could it be the EC2 instance is too small (i.e. not enough storage / memory) ?
Hi DrabCockroach54
This seems like a pip issue trying to install from source; try upgrading the pip version before installing numpy, it should solve it 🤞
it works if I run the same command manually.
What do you mean?
Can you do:
```
docker run -it <my container here> bash
```
and then immediately get an interactive bash?
Hi JitteryCoyote63 ,
upload_artifacts was designed to upload pre-made artifacts, which actually covers everything.
With register_artifacts we tried to have something that will constantly log a pandas DataFrame artifact; the use case was the examples used for training and their order, so we could compare the execution of two different experiments and detect dataset contamination etc.
Not sure it is actually useful though ...
Retrieving an artifact from a Task is done by:
```
Task.get_task(task_id='aaa').artifact...
```
Hi AstonishingSwan80 , what do you mean by "ec2 API"?
HealthyStarfish45 the pycharm plugin is mainly for remote debugging, you can of course use it for local debugging but the value is just to be able to configure your user credentials and trains-server.
In remote debugging, it will make sure the correct git repo/diff are stored alongside the experiment (this is due to the fact that pycharm will not sync the .git folder to the remote machine, so without the plugin Trains will not know the git repo etc.)
Is that helpful ?
Hi HealthyStarfish45
- is there an advantage in using tensorboard over your reporting?
Not unless your code already uses TB or has some built-in TB loggers.
html reporting looks powerful, can one inject some javascript inside?
As long as the JS is self contained in the html script, anything goes :)
there was a problem with index order when converting from pytorch tensor to numpy array
HealthyStarfish45 I'm assuming you are sending numpy to report_image (which makes sense). If you want to debug it, you can also test tensorboard add_image or matplotlib imshow; both will send debug images.
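To illustrate the index-order point, a small sketch (the array sizes and titles are arbitrary): report_image expects a channels-last HxWxC image, so a CHW PyTorch-style tensor needs a transpose after converting to numpy:

```python
import numpy as np

# Fake CHW array (channels first), the layout a pytorch tensor would have
img_chw = np.zeros((3, 32, 32), dtype=np.uint8)

# report_image expects channels-last (H, W, C), hence the transpose
img_hwc = np.transpose(img_chw, (1, 2, 0))

try:
    from clearml import Logger
    logger = Logger.current_logger()
    if logger:
        logger.report_image(title="debug", series="sample",
                            iteration=0, image=img_hwc)
except ImportError:
    pass  # clearml not installed; the transpose above is the point
```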
Hi SuperiorDucks36
you have such a great and clear GUI
😊
I personally would love to do it with a CLI
Actually a lot of stuff is harder to get from the UI (like the current state of your local repository etc.), but I think your point stands 🙂 We will start with the CLI, because it is faster to deploy/iterate; then when you guys say this is a winner we will have a wizard in the UI.
What do you think?
Okay, make sure that in your trains.conf on all the trains-agent machines you add the following:
```
agent.extra_docker_arguments: ["-v", "/etc/hosts:/etc/hosts",]
```
Yes, the left side is the location of the file on the host machine, the right side is the location of the file inside the docker; in our case it is the same location.