ideally, I want to hardcode, e.g. use_staging = True, enqueue it; and then start the second instance via clone → edit user_properties → enqueue in the UI
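Something like this is what I have in mind - a rough sketch only (project/task names and the property handling are my assumptions):
```python
from clearml import Task

# expose the hardcoded flag as a user property, so a clone can override it
# via clone -> edit user properties -> enqueue in the UI (names are placeholders)
task = Task.init(project_name="my_project", task_name="serving_switch")

props = task.get_user_properties(value_only=True)
if "use_staging" not in props:
    # first (hardcoded) run: register the default so it shows up in the UI
    task.set_user_properties(use_staging="true")
    use_staging = True
else:
    # cloned run: pick up whatever was edited in the UI before enqueueing
    use_staging = str(props["use_staging"]).lower() == "true"
```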
like, replace a model in staging Seldon with this model from ClearML; push this model to prod Seldon, but in shadow mode
we are just entering the research phase for a centralized serving solution. Main reasons against clearml-serving triton are: 1) no support for Kafka 2) no support for shadow deployments (both of these are supported by Seldon, which is currently the best-looking option for us)
for some reason, when I ran it the previous time, the repo, commit and working dir were all empty
so probably, my question can be transformed into: “Can I have control over what command is used to start my script on clearml-agent?”
we certainly modified some deployment conf, but let’s wait for answers tomorrow
yeah, I think I’ll go with schedule_function right now, but your proposed idea would make it even clearer.
gotcha, thanks!
but the old ones are there, and I can’t do anything about them
self-hosted. Just upgraded to latest version today (1.1.1). The problem appeared when we were still using 1.0.2
or somehow, we can centralize the storage of S3 credentials (i.e. on clearml-server) so that clients can access S3 through the server
Adding venv into cache: /root/.clearml/venvs-builds/3.8
Running task id [aa2aca203f6b46b0843699d1da373b25]:
[.]$ /root/.clearml/venvs-builds/3.8/bin/python -u '/root/.clearml/venvs-builds/3.8/code/-m filprofiler run catboost_train.py'
I found this in the conf:
# Default auto generated requirements optimize for smaller requirements
# If True, analyze the entire repository regardless of the entry point.
# If False, first analyze the entry point script, if it does not contain other to local files,
# do not analyze the entire repository.
force_analyze_entire_repo: false
but this time they were all present, and the command was run as expected:
I think we need logging here: https://github.com/allegroai/clearml-session/blob/bf1851cd3831c19cc0eadd9b2ffc0613f97f16e1/clearml_session/main.py#L564
haven’t tested it within decorator pipelines, but try
Logger.current_logger()
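Rough sketch of what I mean (project name and the metric are made up, and as said, untested within decorator pipelines):
```python
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["accuracy"])
def evaluate(iteration=0):
    # components run as standalone tasks, so import inside the function body
    from clearml import Logger
    accuracy = 0.9  # placeholder value
    Logger.current_logger().report_scalar(
        title="eval", series="accuracy", value=accuracy, iteration=iteration
    )
    return accuracy

@PipelineDecorator.pipeline(name="logger_check", project="my_project", version="0.0.1")
def run_pipeline():
    evaluate(iteration=0)

if __name__ == "__main__":
    # run the pipeline logic locally for a quick check
    PipelineDecorator.run_locally()
    run_pipeline()
```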
But here you can see why it didn’t succeed
Maybe it makes sense to use schedule_function instead of schedule_task_id, and then the schedule function would clone the last task and start the clone?
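Something along these lines is what I had in mind - just a sketch, with project/task/queue names as placeholders:
```python
from clearml import Task
from clearml.automation import TaskScheduler

def clone_and_enqueue_latest():
    # grab the most recently updated matching task (names are placeholders)
    tasks = Task.get_tasks(
        project_name="my_project",
        task_name="nightly_training",
        task_filter={"order_by": ["-last_update"]},
    )
    if not tasks:
        return
    cloned = Task.clone(source_task=tasks[0], name=tasks[0].name + " (scheduled)")
    Task.enqueue(cloned, queue_name="default")

scheduler = TaskScheduler()
scheduler.add_task(schedule_function=clone_and_enqueue_latest, minute=0, hour=6)
scheduler.start_remotely(queue="services")  # or scheduler.start() to run it in-process
```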
I guess you can easily reproduce it by cloning any task which has an input model - logs, hyperparams etc. are reset, but the input model stays.
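Quick reproduction sketch (the task id is just a placeholder):
```python
from clearml import Task

source = Task.get_task(task_id="<task_with_input_model>")
cloned = Task.clone(source_task=source)

print(cloned.get_reported_scalars())  # empty - logs/metrics are reset on clone
print(cloned.models["input"])         # but the input model is still attached
```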
worked fine, thanks!
log: [2021-09-09 11:22:09,339] [8] [WARNING] [clearml.service_repo] Returned 400 for tasks.dequeue in 2ms, msg=Invalid task id: id=28d2cf5233fe41399c255950aa8b8c9d,company=d1bd92a3b039400cbafc60a7a5b1e52b
now the problem is: fil-profiler persists the reports and then exits
I am not registering a model explicitly in apply_model. I guess it is done automatically when I do this:
output_models = train_task_with_model.models["output"]
model_descriptor = output_models[0]
model_filename = model_descriptor.get_local_copy()
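For reference, the explicit way I would expect it to look - a sketch only, model id and names are placeholders, not what I actually run:
```python
from clearml import Task, InputModel

task = Task.init(project_name="my_project", task_name="apply_model")

model = InputModel(model_id="<output_model_id_from_training_task>")
task.connect(model)                      # explicitly registers it as an input model

model_filename = model.get_local_copy()  # download the weights for inference
```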
is it possible to override this?
Then I SSH into the remote machine using the ngrok hostname and tunnel the port for Jupyter