Generally speaking, yes, for exactly that reason: if you are passing a list of files or a folder, it will actually zip them and upload the zip file. Specifically for pipelines it should be similar. BTW I think you can change the number of parallel upload threads in StorageManager, but as you mentioned it is faster to zip into one file. Make sense?
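The zip-before-upload behaviour can be sketched with the standard library; this is a simplified illustration (the helper name `zip_folder_for_upload` is hypothetical, not a ClearML API):

```python
import os
import shutil
import tempfile

def zip_folder_for_upload(folder: str) -> str:
    """Zip a folder into a single archive so it can be uploaded as one file
    (illustrates why one archive beats many small parallel uploads)."""
    # make_archive returns the full path of the created .zip
    archive_base = os.path.join(tempfile.mkdtemp(), "upload_bundle")
    return shutil.make_archive(archive_base, "zip", root_dir=folder)

# usage sketch: create a tiny folder, then zip it
src = tempfile.mkdtemp()
with open(os.path.join(src, "data.txt"), "w") as f:
    f.write("hello")
archive_path = zip_folder_for_upload(src)
```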
What do you see in the console when you start the trains-agent? It should detect the CUDA version.
and I install the tar
I think the only way to do that is to add it into the docker bash setup script (this is a bash script executed before the Task)
basically PVC for all the DBs 🙂
for example, if I somehow start the execution of an agent task in a specific docker container?)
You mean to specify the container from code? or to make sure the agent can access private docker container registry ? Or is it for private pypi container repository ?
Does this require you run the pipeline locally (I see you have set default execution queue) or do any other specific set-up?
Yes, this means the pipeline logic runs manually/locally (logic means launching components, not the actual compute)
Please have a go at it. I'm sure the pseudo code has some quirks, but it should work, and I'll gladly help set it up
SmallDeer34 I have to admit this reference is relatively old, maybe we should update it to http://clearml.ml (would that make sense?)
Yes, but I'm not sure that they need to have separate task
Hmm okay I need to check if this can be easily done
(BTW, the downside of that, you can only cache a component, not a sub-component)
In that case, when you create the Tasks for the steps, do not specify any packages/requirements; then the agent will just use the "requirements.txt" from the repository.
If you need you can also specify them when you create the Task itself see https://github.com/allegroai/clearml/blob/912f6f5ba2328b26de042de03f02de5802df360f/clearml/task.py#L608
https://github.com/allegroai/clearml/blob/912f6f5ba2328b26de042de03f02de5802df360f/clearml/task.py#L609
task=Task.current_task()
Will get me the task object. (right?)
PanickyMoth78 yes, always, from anywhere, this is a singleton object 🙂
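The "same object from anywhere" behaviour is the classic singleton pattern; here is a minimal stdlib sketch of the idea (`CurrentTask` is a hypothetical stand-in, not the actual ClearML implementation):

```python
class CurrentTask:
    """Hypothetical stand-in illustrating the singleton behaviour of
    Task.current_task(): every call returns the same object."""
    _instance = None

    @classmethod
    def current_task(cls):
        # create the instance on first access, then always return it
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

a = CurrentTask.current_task()
b = CurrentTask.current_task()
```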
Hmm StrangePelican34
Can you verify you call Task.init before TB is created ? (basically at the start of everything)
ShallowGoldfish8 how did you get this error?
self.Node(**eager_node_def)
TypeError: __init__() got an unexpected keyword argument 'job_id'
Any specific use case for the required "draft" mode?
So what is the difference ? both running from the same machine ?
Click on the "k8s_schedule" queue, then on the right hand side, you should see your Task, click on it, it will open the Task page. There click on the "Info" Tab, there look for "STATUS MESSAGE" and "STATUS REASON". What do you have there?
Hi PanickyMoth78
PipelineDecorator.set_default_execution_queue('default')
Would close the current process and launch the pipeline logic on the "services" queue. This means the local process is terminated (specifically in your case, the notebook kernel). Does that make sense?
If you want the pipeline logic to stay on the local machine you can say:
@PipelineDecorator.pipeline(..., pipeline_execution_queue=None)
I think the easiest way is to add another glue instance and connect it with CPU pods and the services queue. I have to admit that it has been a while since I looked at the chart but there should be a way to do that
👍
Okay. But we should definitely output an error on that
Hi @<1645597514990096384:profile|GrievingFish90>
You mean the agent itself inside a docker then the agent spins sibling dockers for the Tasks ?
Should be under Profile -> Workspace (Configuration Vault)
Could it be someone deleted the file? this is inside the temp venv folder but it should not get there
Hi @<1523701601770934272:profile|GiganticMole91>
to use https although the scheduled task is using ssh for git?
Sure, as long as it has git_user / git_pass configured in the agent's clearml.conf, it will automatically convert the ssh git URL to https for the git pull
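The rewrite the agent performs can be sketched roughly like this; a simplified illustration, not the actual clearml-agent code (`git_user` / `git_pass` mirror the clearml.conf field names):

```python
import re

def ssh_to_https(url: str, git_user: str, git_pass: str) -> str:
    """Rewrite an ssh git URL into an authenticated https one.
    Simplified sketch of the ssh-to-https conversion the agent does."""
    m = re.match(r"git@([^:]+):(.+)", url)
    if not m:
        return url  # already https (or an unrecognised scheme): leave as-is
    host, path = m.groups()
    return f"https://{git_user}:{git_pass}@{host}/{path}"

converted = ssh_to_https("git@github.com:allegroai/clearml.git", "user", "token")
```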
None
A few epochs is just fine
AstonishingRabbit13
https://github.com/googleapis/google-cloud-python/issues/4941#issuecomment-369472576
Check the openssl version and the system date; this seems like a low-level SSL error (even before authentication)
should I update nodejs in centos image ?
I think so, it might have been forgotten
EnviousStarfish54
plt.show will capture the figure; if you call it multiple times, it will add a running number to the figure itself (because the figure might change, and you might want the history)
if you call plt.imshow, it's the equivalent of debug image, hence it will be shown in the debug-samples tab, as an image.
Make sense ?
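The two cases above look like this in plain matplotlib (assuming Task.init was already called so the capture happens automatically; the headless Agg backend here is only so the sketch runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so show() needs no display
import matplotlib.pyplot as plt
import numpy as np

# A regular figure: plt.show() is captured as a plot, and repeated
# calls get a running number so the history is kept.
plt.plot([0, 1, 2], [0, 1, 4])
plt.title("loss curve")
plt.show()

# An image: plt.imshow() is captured as a debug sample instead,
# so it appears under the debug-samples tab.
plt.imshow(np.random.rand(8, 8), cmap="gray")
plt.show()
```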
I just tested the master with https://github.com/jkhenning/ignite/blob/fix_trains_checkpoint_n_saved/examples/contrib/mnist/mnist_with_trains_logger.py on the latest ignite master and Trains, it passed, but so did the previous commit...
Task.add_requirements does not handle it (traceback in the thread). Any suggestions?
Hmm that is a good point maybe we should fix that 🙂
I'm assuming someone already created this module? Or is it part of the repository?
(if it is, then I assume this is executed from the git root)
Hi SuperiorCockroach75
You mean like turning on caching ? What do you mean by taking too long?