ClearML FAQ | Hello, I'm Using Trains For Logging My Training Script. However, While Using The Logger I'm Getting This: Trains.Task - Warning - ### Task Stopped - User Aborted - Status Changed ### And Eventually The Process Is Killed. If I Disable The Logger, The Proc

Answered


Hello, I'm using Trains for logging my training script. However, while using the logger I'm getting this: trains.Task - WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ### and eventually the process is killed. If I disable the logger, the process runs flawlessly, although some warnings pop up. Could there be any reason the task is being aborted because of those warnings?

  				
Posted 4 years ago by SoreDragonfly16


Answers 12

AgitatedDove14 Updated the Trains version to the mentioned version but it still stops. Regarding exceptions from subprocesses, torchvision doesn't show me any exception that I can handle.

  				
Posted 4 years ago by SoreDragonfly16

SoreDragonfly16 note that if you abort a task in the web UI, it will do exactly what you described: print a message and quit the process. Any chance someone did that?

  				
Posted 4 years ago by AgitatedDove14

SoreDragonfly16 could you reproduce the issue?
What's your OS, and which Trains version are you using?

  				
Posted 4 years ago by AgitatedDove14

By "disable the logger" I mean not using Trains at all, just to make sure the process doesn't stop by itself.

  				
Posted 4 years ago by SoreDragonfly16

Hi SoreDragonfly16
The warning you mention means that the state of the experiment was changed to aborted, which in turn kills the process.
What do you mean by "If I disable the logger"?

  				
Posted 4 years ago by AgitatedDove14

AgitatedDove14 Hey, I just reproduced this. Whenever it happens, I also get a warning from torchvision:
/home/koe1tv/anaconda3/envs/torch/lib/python3.7/site-packages/torchvision/io/video.py:105: UserWarning: The pts_unit 'pts' gives wrong results and will be removed in a follow-up version. Please use pts_unit 'sec'.
Unfortunately, I can't suppress this warning because I don't have access to the parameter mentioned in the warning.
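When the pts_unit argument is not reachable from user code, one possible workaround is to hide the message with Python's standard warnings module. This is a minimal sketch, not something suggested in the thread: the helper name silence_pts_unit_warning is hypothetical, and the message pattern is copied from the warning text quoted above. It only hides the message and would not be expected to change how Trains behaves.

```python
import warnings


def silence_pts_unit_warning():
    """Ignore the torchvision pts_unit UserWarning quoted in this thread.

    Hypothetical helper for illustration. The `message` argument is a
    regular expression matched against the start of the warning text.
    """
    warnings.filterwarnings(
        "ignore",
        message=r"The pts_unit 'pts' gives wrong results",
        category=UserWarning,
    )


# Call once, before the code that triggers the warning runs.
silence_pts_unit_warning()
```

Other UserWarning messages are still shown, because the filter matches only text that begins with the quoted prefix.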

  				
Posted 4 years ago by SoreDragonfly16

SoreDragonfly16 could you test with Task.init using reuse_last_task_id=False? For example:
task = Task.init('project', 'experiment', reuse_last_task_id=False)
The only thing I can think of is two experiments with the same project/name running on the same machine. Passing reuse_last_task_id=False ensures that every time you run the code, a new experiment is created.
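For reference, the suggestion above could look like this in a training script. This is a sketch under assumptions: the wrapper start_fresh_task is a hypothetical name, the project and experiment names are the placeholders from the post, and the import is done inside the function only so the snippet can be loaded on a machine where trains is not installed.

```python
def start_fresh_task(project_name, task_name):
    """Create a new Trains experiment on every run instead of reusing the last one.

    Hypothetical wrapper for illustration; it forwards to Task.init with
    reuse_last_task_id=False, as suggested in the post above.
    """
    # Imported here so that merely loading this snippet does not require trains.
    from trains import Task

    return Task.init(
        project_name=project_name,
        task_name=task_name,
        reuse_last_task_id=False,  # always register a new experiment
    )


if __name__ == "__main__":
    # Placeholder names taken from the example in the thread.
    task = start_fresh_task("project", "experiment")
```

With this in place, two runs that share a project and experiment name should end up as separate experiments rather than sharing one task ID.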

  				
Posted 4 years ago by AgitatedDove14

Also, SoreDragonfly16, could you test whether the issue exists with trains==0.16.2rc0?

  				
Posted 4 years ago by AgitatedDove14

Thanks for your support. My OS is Ubuntu 18.04.5 LTS and the Trains version is 0.16.0. I can't run this code right now because my machine is busy with some other heavy jobs, but I'll try reproducing this as soon as they finish.

  				
Posted 4 years ago by SoreDragonfly16

Probably not

  				
Posted 4 years ago by SoreDragonfly16

SoreDragonfly16 the torchvision warning has nothing to do with the Trains warning.
The Trains warning means that something changed the state of the Task from running (in_progress) to "stopped" (aborted). Could it be that one of the subprocesses raised an exception?

  				
Posted 4 years ago by AgitatedDove14

For example, here are the two last log lines from my process:
2020-09-11 18:34:50 /home/koe1tv/anaconda3/envs/torch/lib/python3.7/site-packages/torchvision/io/video.py:105: UserWarning: The pts_unit 'pts' gives wrong results and will be removed in a follow-up version. Please use pts_unit 'sec'.--
2020-09-11 18:34:52 2020-09-11 08:34:52,109 - trains.Task - WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ###

  				
Posted 4 years ago by SoreDragonfly16
