clearml==1.5.0
WebApp: 1.5.0-192 Server: 1.5.0-192 API: 2.18
looking into the output folder of catboost, I see 3 types of metrics outputs:
1. tfevents (can be read by tensorboard)
2. catboost_training.json (custom (?) format). It is read here to be shown as an ipython widget: https://github.com/catboost/catboost/blob/c2a6ed0cb85869a73a13d08bf8df8d17320f8215/catboost/python-package/catboost/widget/ipythonwidget.py#L93
3. learn_error.tsv, test_error.tsv, time_left.tsv, which contain the same data as the json. Apparently they are meant to be used with this stale metrics viewer pr...
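If it helps, here is a rough sketch (mine, not something CatBoost or ClearML ship) of reading those TSV files and reporting them as ClearML scalars; the catboost_info directory and the iter column name are assumptions based on the default train_dir layout:

```python
# Sketch: push CatBoost's learn/test TSV metrics into ClearML scalars.
import csv
from pathlib import Path

from clearml import Task

task = Task.init(project_name="catboost", task_name="metrics-import")  # placeholder names
logger = task.get_logger()

for tsv_name in ("learn_error.tsv", "test_error.tsv"):
    tsv_path = Path("catboost_info") / tsv_name  # default CatBoost train_dir (assumption)
    if not tsv_path.exists():
        continue
    with tsv_path.open() as f:
        reader = csv.DictReader(f, delimiter="\t")
        for row in reader:
            iteration = int(row.pop("iter"))  # first column of the TSV (assumption)
            for metric, value in row.items():
                logger.report_scalar(
                    title=metric,
                    series=tsv_name.split("_")[0],  # "learn" / "test"
                    value=float(value),
                    iteration=iteration,
                )
```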
for some reason, when I ran it the previous time, the repo, commit and working dir were all empty
but this will be invoked before fil-profiler starts generating them
[.]$ /root/.clearml/venvs-builds/3.8/bin/python -u '/root/.clearml/venvs-builds/3.8/code/-m filprofiler run catboost_train.py'
I am not registering a model explicitly in apply_model. I guess it is done automatically when I do this:
output_models = train_task_with_model.models["output"]
model_descriptor = output_models[0]
model_filename = model_descriptor.get_local_copy()
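Roughly the full flow I mean (project/task names are placeholders, and loading the file back with CatBoost is just my illustration):

```python
# Sketch: fetch the training task's output model and load it into CatBoost.
from catboost import CatBoost
from clearml import Task

train_task_with_model = Task.get_task(
    project_name="my_project",    # placeholder
    task_name="catboost_train",   # placeholder
)

output_models = train_task_with_model.models["output"]
model_descriptor = output_models[0]
model_filename = model_descriptor.get_local_copy()

model = CatBoost()
model.load_model(model_filename)
```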
and I have no way to save those as clearml artifacts
yes, but note that I'm not talking about a VS Code instance set up by clearml-session, but about a local one. I'll do another test to determine whether VS Code from clearml-session suffers from the same problem
but this time they were all present, and the command was run as expected:
now the problem is: fil-profiler persists the reports and then exits
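One workaround I might try (not something ClearML does for you, and the fil-result directory name is my assumption about fil-profiler's default output): wrap the profiled run in a subprocess and upload whatever was persisted after it exits:

```python
# Sketch: run fil-profiler as a child process, then upload its reports as an artifact.
import subprocess

from clearml import Task

task = Task.init(project_name="profiling", task_name="fil-profiler run")  # placeholder names

subprocess.run(
    ["python", "-m", "filprofiler", "run", "catboost_train.py"],
    check=False,  # fil-profiler persists the reports and then exits
)

# fil-profiler writes its reports under ./fil-result by default (assumption)
task.upload_artifact(name="fil-report", artifact_object="fil-result")
```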
nope, I need to contact the devops team for that, and that can't happen before Monday
no new unremovable entries have appeared (although I haven't tried)
we certainly modified some deployment conf, but let's wait for answers tomorrow
You have two options
I think both can work, but it's too much of a hassle. I think I'll skip extracting the common code and keep it duplicated for now
I want to have 2 instances of the scheduler - one starts reporting jobs for staging, the other for prod
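Roughly what I have in mind with TaskScheduler (the env variable, task IDs, queue names and the daily 06:00 slot are all placeholders); the same script would be launched twice, once per environment:

```python
# Sketch: one scheduler service per environment, each enqueuing its own reporting task.
import os

from clearml.automation import TaskScheduler

ENV = os.environ.get("REPORTING_ENV", "staging")  # "staging" or "prod"
REPORT_TASK_ID = {
    "staging": "STAGING_REPORT_TASK_ID",  # placeholder task IDs
    "prod": "PROD_REPORT_TASK_ID",
}[ENV]

scheduler = TaskScheduler()
scheduler.add_task(
    schedule_task_id=REPORT_TASK_ID,
    queue=f"{ENV}_queue",  # placeholder queue names
    day=1,
    hour=6,
    minute=0,              # run the reporting job daily at 06:00
    recurring=True,
)

# each invocation becomes its own scheduler service task
scheduler.start_remotely(queue="services")
```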
worked fine, thanks!
as I understand this: even though force=false, my script is importing another module from the same project and thus triggering analyze_entire_repo
not sure I fully get it. Where will the connection between task and scheduler appear?
Here's my workaround - ignore the fail messages, and manually create an SSH connection to the server with the Jupyter port forwarded.
this is where the "magic" happens
Then I ssh into the remote machine using the ngrok hostname and tunnel the port for Jupyter
example here: https://github.com/martjushev/clearml_requirements_demo
do you want a fully reproducible example or just 2 scripts to illustrate?
I don't see these lines when requirement deduction from imports happens.
I already added it to the task:
Workaround: Remove limit_execution_time from scheduler.add_task
I found this in the conf:
# Default auto generated requirements optimize for smaller requirements
# If True, analyze the entire repository regardless of the entry point.
# If False, first analyze the entry point script, if it does not contain other to local files,
# do not analyze the entire repository.
force_analyze_entire_repo: false
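Side note, in case the import scan keeps pulling in the whole repo: as far as I understand the API, the requirements can also be pinned explicitly before Task.init so the analysis result does not matter (package name/version below are placeholders):

```python
# Sketch: declare requirements explicitly instead of relying on import analysis.
from clearml import Task

Task.add_requirements("catboost", "1.0.6")  # placeholder package pin
# or freeze the whole current environment instead:
# Task.force_requirements_env_freeze(force=True)

task = Task.init(project_name="demo", task_name="explicit-requirements")  # placeholder names
```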
But the second problem hints that we need to change Dict[datetime, str] -> Dict[str, datetime] or do some custom processing before serialization
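Something like this is what I mean by custom processing: turn the datetime keys into ISO strings on the way out and parse them back on the way in (names are made up for the example):

```python
# Sketch: make a Dict[datetime, str] survive JSON-style serialization.
from datetime import datetime
from typing import Dict

def serialize_schedule(schedule: Dict[datetime, str]) -> Dict[str, str]:
    return {ts.isoformat(): name for ts, name in schedule.items()}

def deserialize_schedule(raw: Dict[str, str]) -> Dict[datetime, str]:
    return {datetime.fromisoformat(ts): name for ts, name in raw.items()}

schedule = {datetime(2022, 7, 4, 12, 30): "staging-report"}
assert deserialize_schedule(serialize_schedule(schedule)) == schedule
```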
I think we can live without mass deleting for a while