GrievingTurkey78
maybe since the package is not directly imported in my code, it is possible to get a different version than what I have locally (?).
If these are derivative packages (i.e. imported by other packages) they are not automatically logged when executing the Task manually (in order to keep the "installed packages" list as lean as possible on the one hand, while still specifying the packages that are important for you on the other)
That said, when the "trains-agent" executes the task it will store back...
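If you do want one of those derivative packages pinned explicitly, a minimal sketch (the package, project and task names below are placeholders) is to call Task.add_requirements before Task.init:
from clearml import Task

# explicitly log a package that is only pulled in indirectly;
# a version can optionally be pinned as a second argument
Task.add_requirements("some_package")
task = Task.init(project_name="examples", task_name="requirements demo")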
Is it not possible to serve a model with a preprocessing pipeline from scikit-learn using clearml-serving?
of course it is, did you first try the example here: None
If you need to run your own LogisticRegression call, you can use this example:
None
Notice this is where the custom endpoint actually calls the prediction: [None](https...
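Roughly, the custom endpoint pattern in those examples looks like the sketch below (based on the clearml-serving examples; the exact method signatures can differ between versions, and the joblib-loaded scikit-learn pipeline and the request field names are assumptions):
# preprocess.py - sketch of a custom clearml-serving endpoint
from typing import Any
import joblib

class Preprocess(object):
    def __init__(self):
        self._model = None

    def load(self, local_file_name: str):
        # called with the locally downloaded model file (assumed to be a joblib dump)
        self._model = joblib.load(local_file_name)

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # turn the request body into the pipeline's expected input
        return [body["features"]]

    def process(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> Any:
        # this is where the custom endpoint actually calls the prediction
        return self._model.predict(data)

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        return {"prediction": list(data)}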
sudo curl -L "
-s)-$(uname -m)" -o /usr/local/bin/docker-compose
(with older clearml versions though…).
Yes, we added a content-type header for the files when uploading to S3 (so it is easier for users to serve them back). But it seems the Python 3.5 casting from Path to str breaks the mimetype call...
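For illustration only (this is not the actual clearml code, just the kind of call that trips on a Path object with older Pythons):
import mimetypes
from pathlib import Path

file_path = Path("model.json")  # placeholder file name
# older Python versions expect a plain string here, so the Path has to be cast explicitly
content_type, _ = mimetypes.guess_type(str(file_path))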
MotionlessCoral18 so did it solve the issue ?
SmallDeer34
I think this is somehow related to the JIT compiler torch is using.
My suspicion is that JIT cannot be initialized after something happened (like a subprocess, or a thread).
I think we managed to get around it with 1.0.3rc1.
Can you verify ?
Nicely found @MuddyRobin9!
we made two tb versions of / task and wrote in parallel.
And I wanted to know if it is possible here as well.
Basically you will have different series (based on the TB log file) on the same graph so you can compare, all automatically
The problem is that even when I mount the SSH key into the root home directory (e.g.,
/root/.ssh/id_rsa
with the correct permissions set to 400) I still encounter the same error.
The agent automatically mounts the .ssh folder from the host into the container, making sure all the permissions are set.
how can I run
pip install -e .
in general the agent will add the "working" dir into the PYTHONPATH so that you should not have to manually do "-e ."
Tha...
ShallowGoldfish8 this call does that:
https://github.com/allegroai/clearml/blob/0397f2b41e41325db2a191070e01b218251bc8b2/examples/advanced/execute_remotely_example.py#L127
BTW: from the instance name it seems like it is a VM with preinstalled PyTorch. Why don't you add system site packages, so the venv will inherit all the preinstalled packages? It might also save some space
DeterminedToad86 see here:
https://github.com/allegroai/clearml-agent/blob/0462af6a3d3ef6f2bc54fd08f0eb88f53a70724c/docs/clearml.conf#L55
Change it in the agent's conf file to:
system_site_packages: true
HugePelican43 sure you can, usually the limiting factor is memory, as it cannot be shared among processes, so if one process allocates all the memory the second one will crash with an out-of-memory error
No, an old experiment changed, nothing was rerun
ohh, that is odd. I think the max iteration value is stored in the DB, which is odd if it changed after an update.
BTW: just making sure, could it be these Tasks were imported ? (i.e. offline execution + import)
What's the "working directory" ?
What's the trains-agent version?
(yes this should have worked, as long as the package "test" is there)
I want to be able to delete only the logs since they are taking a lot of space in my case.
I see... I do not think this is possible
You can disable the auto logging though... pass auto_connect_streams=False to Task.init
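A minimal sketch (project/task names are placeholders):
from clearml import Task

# disable automatic capture of stdout/stderr/logging (i.e. the console log)
task = Task.init(
    project_name="examples",
    task_name="no console logging",
    auto_connect_streams=False,
)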
How so? They are in one place. The creation of the venv is transparent, and the packages that are there are everything you have in the docker, plus the ability to override them from the UI.
What am I missing here ?
I mean this blob is then saved on the fs
It can if you do:
temp_file = task.connect_configuration('/path/to/config/file', name='configuration object is a config file')
Then temp_file is actually a local copy of the text coming from the Task.
When running in manual mode the content of '/path/to/config/file' is stored on the Task.
When running remotely by the agent, the content from the Task is dumped into a temp file and the path to that file is returned in temp_file.
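A minimal sketch of that flow (project/task names and the config path are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="config file demo")
# manual run: the file's content is stored on the Task and the original path is returned
# agent run: the stored content is dumped to a temp file and that temp path is returned
config_path = task.connect_configuration('/path/to/config/file', name='my config file')
with open(config_path) as f:
    config_text = f.read()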
NICE! MoodyCentipede68 this is awesome!
I see, by default it will look for requirements.txt in the root of the repo (the actual repo).
That said, in code you can specify the requirements.txt:
Task.force_requirements_env_freeze(requirements_file='repo/project-a/requirements.txt')
task = Task.init(...)
Notice, you need to call it prior to the Task.init call
I notice that, in my Serving Service situated in the DevOps project, the "endpoints" section doesn't seem to get updated when I tag a new model with "released".
It takes a few minutes (I think 5 min is the default) to update.
Notice that you need to add the model with
model auto-update --engine triton --endpoint "test_model_pytorch_auto" ...
Not with model add (if for some reason that does not work please let me know)
No need to pass the model version i.e. 1
you can ...
Lol, :)
I think the issue is that you do not need to manually set the initial iteration, it's supposed to pick it up automatically, as it is stored on the Task itself
hmmm, somehow I have a bad feeling about it... Could you check the log? It should say something like "Collecting torch==1.6.0.dev20200421+cu101 from https://"
It should be right at the top of the installation. What do you have there?
ElegantCoyote26 point me to where Keras stores the data
If in the process of integration you had to add a logger/callback to your Keras code, that is the equivalent of using the TB.
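For example (a sketch; the log directory, project/task names and the commented-out model call are placeholders), adding the standard TensorBoard callback is enough for the scalars to show up:
import tensorflow as tf
from clearml import Task

task = Task.init(project_name="examples", task_name="keras tb demo")
tb_callback = tf.keras.callbacks.TensorBoard(log_dir="./tb_logs")
# model.fit(x_train, y_train, callbacks=[tb_callback])
# whatever the TensorBoard callback writes is then logged automatically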
clearml launches a subprocess
correct, this subprocess is used for resource monitoring and sending logs in the background (i.e. metrics, console, etc.)
Where does the "training" part come from? I'm assuming the training is your main code?
Follow up, is this happening when running manually or when executed via the agent ?
WackyRabbit7 basically starting with v1.1, if you are running code without any configuration file you will get an error (in contrast to previous versions, where it defaulted to the demo server)
LOL yes
just make sure it won't be part of the uncommitted changes of the AWS autoscaler
WickedElephant66 this seems like a general network issue, like the docker service is missing your company's firewall certificate.
Can you pull any container from docker hub ?