This is the ClearML python client, no need to change the server
Oh, this is done internally: the background thread can signal it is not deferred. Are you saying there is a bug, or just that the code looks odd?
Ohh, then yes, you can use the https://github.com/allegroai/clearml/blob/bd110aed5e902efbc03fd4f0e576e40c860e0fb2/clearml/automation/monitor.py#L10 class to monitor changes in the dataset/project
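A rough sketch of subclassing it (the process_task callback and monitor loop follow the slack_alerts example built on this class; double-check the exact hook names against monitor.py):
```python
from clearml.automation.monitor import Monitor


class ProjectWatch(Monitor):
    # called by the polling loop for every newly detected Task
    def process_task(self, task):
        print('change detected:', task.id, task.name)


watcher = ProjectWatch()
watcher.monitor(pool_period=60.0)  # poll the server every 60 seconds, runs forever
```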
It's only on this specific local machine that we're facing this truncated download.
Yes, that's what the log says, makes sense
Seems like this still doesn't solve the problem, how can we verify this setting has been applied correctly?
Hmm, exec into the container? What did you put in clearml.conf?
Check if the fileserver docker container is running with docker ps
Maybe that's the issue:
https://github.com/googleapis/python-storage/issues/74#issuecomment-602487082
Just curious about the timeout, was it configured by ClearML or by GCS? Can we customize the timeout?
I'm assuming this is GCS; in the end the actual upload is done by the GCS python package.
Maybe there is an env variable ... Let me google it
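For reference, reasonably recent versions of the google-cloud-storage package accept a per-call timeout on uploads directly (this is the underlying package, not a ClearML setting; bucket and file names here are placeholders):
```python
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('my-bucket')  # placeholder bucket name
blob = bucket.blob('artifacts/model.bin')
# raise the per-request timeout (in seconds) for large or slow uploads
blob.upload_from_filename('model.bin', timeout=300)
```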
That's the question I want to raise too.
No file size limit
Let me try to run it myself
Hi JitteryCoyote63
If you want to stop the Task, click Abort (Reset will not stop the task or restart it, it will just clear the outputs and let you edit the Task itself).
I think we witnessed something like that due to DataLoader multiprocessing issues, and I think the solution was to add multiprocessing_context='forkserver' to the DataLoader:
https://github.com/allegroai/clearml/issues/207#issuecomment-702422291
Could you verify?
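If it helps, passing the multiprocessing context when constructing the DataLoader looks roughly like this (the dataset and worker count are placeholders):
```python
import torch
from torch.utils.data import DataLoader, TensorDataset

if __name__ == '__main__':
    dataset = TensorDataset(torch.randn(128, 3), torch.randint(0, 2, (128,)))
    loader = DataLoader(
        dataset,
        batch_size=32,
        num_workers=4,
        # 'forkserver' starts workers from a clean process instead of fork(),
        # avoiding the deadlocks discussed in clearml issue #207
        multiprocessing_context='forkserver',
    )
    for batch, labels in loader:
        pass  # training loop goes here
```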
When I passed specific arguments (for example --steps) it ignored them...
script.py test blah1 blah2 blah3 42
Is this how it is intended to be used ?
Hi TrickySheep9
So basically the idea is you can quickly code a scheduler with your own logic, then launch it on the "services queue" where it runs basically forever 🙂
This could be a good example:
https://github.com/allegroai/clearml/blob/master/examples/services/monitoring/slack_alerts.py
https://github.com/allegroai/clearml/blob/master/examples/automation/task_piping_example.py
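The pattern in both examples is roughly: initialize a Task, push it to the services queue, then loop forever. A minimal sketch (project/task names are placeholders, queue name per the default ClearML setup):
```python
import time

from clearml import Task

task = Task.init(project_name='DevOps', task_name='my scheduler')
# move execution to an agent listening on the services queue;
# the local process exits and the agent keeps the loop running
task.execute_remotely(queue_name='services', exit_process=True)

while True:
    # your scheduling / monitoring logic goes here
    time.sleep(60)
```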
Okay, that makes sense. If this is the case, I'm assuming you have set the files server to point to your S3 bucket, is that correct?
Could it be you are missing the credentials for that? (It is trying to upload the preprocessing code there, so the clearml-serving container will be able to pull it later.)
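To quickly sanity-check that the credentials can actually write to the bucket, you could try something like this (the bucket/path are placeholders):
```python
from clearml import StorageManager

# upload a small local file to the same S3 destination the files server points at;
# if credentials are missing or wrong, this will fail with an access error
remote = StorageManager.upload_file(
    local_file='test.txt',
    remote_url='s3://my-bucket/clearml/test.txt',  # placeholder bucket/path
)
print('uploaded to', remote)
```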
Hi ReassuredTiger98
Could you add some prints before / after the artifact upload?
Also, what's the clearml version you are using?
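Something like this, just to see where it hangs (the artifact name/object and project/task names are placeholders):
```python
from clearml import Task

task = Task.init(project_name='examples', task_name='artifact debug')
print('before artifact upload')
task.upload_artifact(name='my_artifact', artifact_object={'a': 1})  # placeholder object
print('after artifact upload scheduled')
task.flush(wait_for_uploads=True)  # block until background uploads finish
print('after flush, upload completed')
```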
Is this caused by running the script with the arguments?
Yep 🙂
Could you send me the console log of both tasks, the failing and the passing one?
At the top there should be the URL of the notebook (I think)
You are doing great 🙂 don't worry about it
PungentLouse55 you can find the metrics in the "original" (aka base template) experiment.
I want to optimize hyperparameters with trains.automation but: ...
Yes, you are correct. In the case of the example code it should be "General/..."; if you have ArgParser, it should be "Args/...". And yes, it looks like the metric is wrong: it should be "epoch_accuracy" & "epoch_accuracy" (title & series).
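For reference, wiring those names into the optimizer would look roughly like this (based on the clearml/trains hyper-parameter optimization example; the base task id, queue, and parameter range are placeholders):
```python
from clearml import Task
from clearml.automation import HyperParameterOptimizer, UniformIntegerParameterRange

task = Task.init(project_name='examples', task_name='HPO',
                 task_type=Task.TaskTypes.optimizer)

optimizer = HyperParameterOptimizer(
    base_task_id='<base task id>',  # placeholder: the template experiment
    hyper_parameters=[
        # 'Args/...' for argparse parameters, 'General/...' for connected dicts
        UniformIntegerParameterRange('Args/batch_size',
                                     min_value=16, max_value=128, step_size=16),
    ],
    # title & series of the scalar to optimize
    objective_metric_title='epoch_accuracy',
    objective_metric_series='epoch_accuracy',
    objective_metric_sign='max',
    execution_queue='default',  # placeholder: queue your agents listen on
    max_number_of_concurrent_tasks=2,
)
optimizer.start()
optimizer.wait()
optimizer.stop()
```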
Hi ElegantCoyote26
but I can't see any documentation or examples about the updates done in version 1.0.0
So actually the docs are only for 1.0... https://clear.ml/docs/latest/docs/clearml_serving/clearml_serving
Hi there, are there any plans to add better documentation/examples about...
Yes, this is work in progress; the first item on the list is a custom model serving example (kind of like this one https://github.com/allegroai/clearml-serving/tree/main/examples/pipeline )
Hi @<1619505588100665344:profile|GrievingHare27>
My understanding is that initiating a task with Task.init() captures the code for the entire notebook. I'm facing difficulties when attempting to build a final training pipeline (in a separate notebook) that uses only certain functions from the other notebooks/tasks as pipeline steps.
Well, this is kind of the limit of working with jupyter notebooks; referencing code from one to another is not really feasible (of co...
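One way around it is to make the shared functions self-contained (or move them into a plain .py module) and build the pipeline from functions; a rough sketch using PipelineController (names and step logic are placeholders):
```python
from clearml import PipelineController


def step_one():
    # self-contained: all imports inside, no notebook-level references
    import numpy as np
    return np.arange(10).tolist()


def step_two(data):
    return sum(data)


pipe = PipelineController(name='my pipeline', project='examples', version='1.0')
pipe.add_function_step(name='step_one', function=step_one, function_return=['data'])
pipe.add_function_step(
    name='step_two',
    function=step_two,
    function_kwargs=dict(data='${step_one.data}'),  # wire step_one's output in
    function_return=['result'],
)
pipe.start_locally(run_pipeline_steps_locally=True)  # or start() to run on agents
```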
Wait, who is creating this file? I thought you removed it in the uncommitted changes
Hi NastyFox63
What do you mean, not all of them are shown?
Do they have different series/titles? Are they plots or scalars? How are you reporting them?
Let me rerun the code and check
NastyFox63, ask SuccessfulKoala55 tomorrow; I think there is a way to change the default settings even with the current version.
(I.e. increase the default 100 entries limit)
Just to make sure I understand: running locally creates the Args/command correctly, and execute_remotely also creates the correct Args/command, but when the agent actually executes it on the remote machine, it updates Args/command back to a list. Is that a correct description?
remote repository's lock file.
Which file is that? The poetry lock, or the internal VCS lock (the agent itself)?
if project_name is None and Task.current_task() is not None:
    project_name = Task.current_task().get_project_name()
This should have fixed it, no?