If the problem persists (i.e. trains failing to detect packages), please open a GitHub issue so the bug does not get lost 🙂
We actually added a specific call to stop the local execution and continue remotely, see it here: https://github.com/allegroai/trains/blob/master/trains/task.py#L2409
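For reference, a minimal sketch of how that call might be used, assuming trains 0.15 or later and that the call in question is Task.execute_remotely() (the one listed in the 0.15.0 release notes). The helper name run_remotely and its parameters are placeholders, not part of trains.

```python
def run_remotely(project_name, task_name, queue_name):
    """Hypothetical helper: register the task locally, then hand it to an agent."""
    # Imported lazily so this sketch can be loaded without trains installed.
    from trains import Task

    task = Task.init(project_name=project_name, task_name=task_name)
    # Assumption: trains >= 0.15 and a trains-agent listening on `queue_name`.
    # The call is meant to stop the local run and enqueue the task instead.
    task.execute_remotely(queue_name=queue_name)
    # Code placed after this point is expected to run on the remote machine.
    return task
```

The project, task and queue names are whatever you use on your own server, for example run_remotely("my project", "my experiment", "default").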
Yes actually that might be it. Here is how it works,
It launches a thread in the background to do all the analysis of the repository, extracting all the packages.
If the process ends (for any reason), it gives the background thread 10 seconds to finish and then gives up. If the repository is big, the analysis can take longer than that, and it will quit before the packages are recorded.
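To illustrate the mechanism described above, here is a rough sketch using plain Python threads. This is not the trains implementation; analyze_repository and run_with_grace_period are made-up names, and the package list is dummy data.

```python
import threading
import time


def analyze_repository(result, work_seconds):
    """Stand-in for the slow repository/package analysis (hypothetical)."""
    time.sleep(work_seconds)
    result["packages"] = ["numpy", "trains"]  # dummy data for the sketch


def run_with_grace_period(work_seconds, grace_seconds):
    """Start the analysis in the background, then wait at most grace_seconds."""
    result = {}
    worker = threading.Thread(
        target=analyze_repository, args=(result, work_seconds), daemon=True
    )
    worker.start()
    # ... the user's script would run here; once it ends we wait for the thread.
    worker.join(timeout=grace_seconds)
    if worker.is_alive():
        # The analysis did not finish within the grace period: give up.
        return None
    return result["packages"]
```

With a short analysis the packages come back; with an analysis longer than the grace period the function returns None, which corresponds to the "missing packages" case when a run is stopped early.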
I ran it to the end the first time and it finished successfully, but I had commented out the trains line, so I didn't see it in trains. So I ran it again and stopped it in the middle...
If I stopped the original run before it ended, could that be the problem?
It should be the last line (or almost) of the log. Is it there? Also, from the log it seems you are using trains 0.14.3. Try with trains 0.15 and let me know if you are still missing packages.
and it continues... but nothing about a timeout
In the log of the "original one" I see:
TRAINS Task: created new task id=2eec56cb60e9441897a4af9c10f656c0
TRAINS new package available: UPGRADE to v0.15.0 is recommended!
Release Notes:
Features
- Add automation support including hyper-parameters optimization (see example here)
- Task.init() auto_connect_arg_parser argument can accept a dictionary disabling specific keys from the argparser (Trains Slack channel thread)
- Allow worker_id override using TRAINS_WORKER_NAME environment variable (Trains Slack channel thread)
- Support layout configuration for plotly objects using extra_layout argument in all Logger reporting methods #136
- Add Task.execute_remotely() to allow cloning and enqueuing a locally executed task (or stopping and re-enqueuing a remotely executed task) #128
- Add Parquet framework and model type
- Support recursive model folder packaging
- Add Task.get_reported_console_output() and Task.get_reported_scalars() to allow retrieval of reported output and scalar metrics
- Add Task.add_requirements() to force requirement package into "installed packages"
- Improve task reuse responsiveness
- Add raise_on_error (default False) argument to Model.get_local_copy() and Artifact.get_local_copy() https://github.com/allegroai/trains-agent/issues/17
- Support Task.get_task() without project name (i.e. all projects)
- Support using the file calling Task.init() as the task's script in case sys.argv doesn't point to a git repository
- Support detecting and remotely executing code running from a module (i.e. -m module)
- Add callback for framework save/load binding for better integration with pytorch/ignite https://github.com/pytorch/ignite/issues/1056
- Support new task types provided in Trains Server v0.15.0
- Add automation and distributed examples
- Upgrade default pip version to <20.2
Bug Fixes
- Fix exact_match_regex() in case of empty pattern #138
- Address deprecation warning and newer attrs versions in MetricsEventAdapter #134
- Fix issues with plotly support (Trains Slack channel thread and thread)
- Fix default argument behavior to match argparse behavior
- Fix OutputModel with task=None should use current task, if exists
- Fix Task.get_task() to raise proper error on incorrect task_id
- Fix Task.enqueue() to use an exact queue name match
- Fix NaN, Inf and -Inf values display in reported table (not supported by JSON)
- Limit max requirement size to 0.5MB
- Fix issues with repository analysis
- Fix StorageManager should only try to extract .zip files, Model should not auto extract package https://github.com/allegroai/trains-agent/issues/17
TRAINS results page: http://trains.agro-scout.com:8080/projects/071e139ad6084f2493ff00fe9181f825/experiments/2eec56cb60e9441897a4af9c10f656c0/output/log
inference
In the clone I see "Environment setup completed successfully" and nothing about a timeout.
because it should have detected it...
Did you see "Repository and package analysis timed out ..."?
And do you have import tensorflow in your code?
Do you have a git repo link in the execution section of the experiment?
"Trains will analyze the entire repository if this is a git repo" it is a git repo...
PlainSquid19 Trains will analyze the entire repository if the code is in a git repo, and only the single script file if no repository is found.
It will not analyze an entire folder that is not in a git repository, because it would not be able to recreate that folder anyhow. Makes sense?
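The rule above can be sketched roughly like this. It is only an illustration of the decision, not the actual trains code; find_git_root and analysis_scope are hypothetical helper names.

```python
import os


def find_git_root(start_path):
    """Walk up from start_path; return the first directory holding .git, else None."""
    current = os.path.abspath(start_path)
    if os.path.isfile(current):
        current = os.path.dirname(current)
    while True:
        if os.path.exists(os.path.join(current, ".git")):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a repository.
            return None
        current = parent


def analysis_scope(script_path):
    """Return ("repository", root) inside a git repo, else ("script", path)."""
    root = find_git_root(script_path)
    if root is not None:
        return ("repository", root)
    return ("script", os.path.abspath(script_path))
```

So a script sitting a few folders deep inside a git checkout gets the whole repository analyzed, while a standalone script only gets its own imports picked up.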
@ https://app.slack.com/team/UT8T0V3NE OK, so I see that the listed packages are the ones imported in the Python script that runs, but the list doesn't include the ones needed by imports in other inner classes... I guess that is how it works, so do I need to specify all the packages that are not in the .py script I run (all the packages that the inner classes use)?
@ https://app.slack.com/team/UT8T0V3NE any idea why they are not listed there?
OK, I guess you meant the installed packages on the execution tab, and I looked at the log... So there I see the following:
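If some packages still end up missing, one possible workaround is to list them by hand with Task.add_requirements() (added in trains 0.15.0 according to the release notes). A minimal sketch, assuming trains 0.15 or later and that the call is made before Task.init(); the helper name and its parameters are placeholders.

```python
def init_task_with_forced_requirements(project_name, task_name, package_names):
    """Hypothetical helper: force extra packages into "installed packages"."""
    # Imported lazily so this sketch can be loaded without trains installed.
    from trains import Task

    for name in package_names:
        # Assumption: with no version given, trains decides which version to record.
        Task.add_requirements(name)
    # Assumption: the requirements must be registered before the task is created.
    return Task.init(project_name=project_name, task_name=task_name)
```

For example, init_task_with_forced_requirements("my project", "inference", ["tensorflow"]) would be called once at the top of the script, in place of the plain Task.init() call.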
Python 3.7.5 (default, Nov 7 2019, 10:50:52) [GCC 8.3.0]
argparse_utils == 1.3.0
trains == 0.14.3