BTW:
If I try to find the right model in the
task.models["output"]
(this time there is just one but in my code there may be several) it appears with the
(see other attached screenshot).
What would make sense here ? (I have to be honest I'm not sure).
To be specific, there is "model name", which is not unique, and there is model-key, which is unique to the Task (i.e. task.models["output"]["model-key"])
Just making sure, after the pipe object is created, you can call Task.current_task(), is that correct?
s like the
would be a really good starting place.
This is actually JS (typescript) ... not python, not sure on how to continue from there 😞
My pleasure, and apologies 🙂
I see,
@<1571308003204796416:profile|HollowPeacock58> can you please send the full log?
(The odd thing is that it is trying to install the python 3.10 version of torch, when your command line suggests it is running python 3.8)
Hi @<1661542579272945664:profile|SaltySpider22> I'm not sure I understand the answer to my parallel question
Could it be you have some custom SSL certificate installed, or policy ?
can you get other https sites? (for example your clearml-server)
Hmm yes, that is a good point, maybe we should allow to specify a parameter on the model configuration to help with the actual type ...
I don't know how I would be able to get the description and name?
Good point, how about doing that in code, then you have all the information and you can store it in jsons / pickle next to the data folder?
wdyt?
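A minimal sketch of the "store it in jsons next to the data folder" idea, using only the standard library. The helper names, the `.meta.json` suffix, and the `name` / `description` fields are assumptions for illustration, not part of any ClearML API:

```python
import json
import tempfile
from pathlib import Path


def save_metadata(data_dir: Path, name: str, description: str) -> Path:
    """Write a small JSON sidecar file next to (not inside) the data folder."""
    # Hypothetical naming convention: <data folder name>.meta.json
    meta_path = data_dir.parent / f"{data_dir.name}.meta.json"
    meta_path.write_text(json.dumps({"name": name, "description": description}, indent=2))
    return meta_path


def load_metadata(data_dir: Path) -> dict:
    """Read the sidecar back when the data folder is used later."""
    meta_path = data_dir.parent / f"{data_dir.name}.meta.json"
    return json.loads(meta_path.read_text())


# Demo in a temporary directory so nothing real is touched
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "dataset_v1"
    data_dir.mkdir()
    save_metadata(data_dir, name="dataset_v1", description="example description")
    restored = load_metadata(data_dir)
```

Keeping the file beside the folder rather than inside it means the data itself is left unchanged by the extra metadata.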
Just to make sure, the first two steps are working ?
Maybe it has to do with the fact the "training" step specifies a docker image, could you try to remove it and check?
BTW: A few pointers
The return_values argument is used to specify multiple returned objects stored individually, not the type of the object. If there is a single object, there is no need to specify it.
The parents argument is optional; the pipeline components optimize execution based on inputs, for example in your code, all pipeline comp...
Let's start small. Do you have grafana enabled in your docker compose and can you login to your grafana web ui?
Notice grafana needs to access the prometheus container directly, so the easiest way is to have everything in the same docker compose
VivaciousPenguin66 I have the feeling it is the first space in the URI that breaks the credentials lookup.
Let's test it:from clearml import StorageManager uri = '
` Birds%2FTraining/TRAIN [Network%3A resnet34, Library%3A torchvision] Ignite Train PyTorch CNN on CUB200.8611ada5be6f4bb6ba09cf730ecd2253/models/cub200_resnet34_ignite_best_model_0.pt'
# original
StorageManager.get_local_copy(uri)
# quoted
StorageManager.get_local_copy(uri.replace(' ', '%20')) `
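For reference, here is a small standard-library sketch of the difference between replacing only spaces (the test above) and percent-encoding the whole path. The path below is a made-up example, not the URI from this thread:

```python
from urllib.parse import quote

# Hypothetical object path containing spaces and brackets
path = "bucket/My Project/TRAIN [resnet34]/models/best_model.pt"

# Same idea as the test above: only replace spaces
spaces_only = path.replace(" ", "%20")

# More general: percent-encode every reserved character except "/"
fully_quoted = quote(path, safe="/")

# Caveat: quote() also encodes "%", so an already-encoded string gets double-encoded
double_encoded = quote("a%2Fb", safe="/")
```

Because of that last caveat, a URI that already contains sequences like %2F or %3A is probably better tested with the plain space replacement first.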
GrievingTurkey78 short answer no 😞
Long answer: the files are stored as differential sets (think of a change set relative to the previous version(s)). The collection of files is then compressed and stored as a single zip. The zip itself can be stored on Google, but on their object storage (not GDrive). Notice that the default storage for clearml-data is the clearml-server; that said, you can always mix and match (even between versions).
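To make the "change set plus single zip" idea concrete, here is a rough standard-library sketch. It is an illustration of the concept only, not how clearml-data is actually implemented; all function names and the in-memory file dictionaries are assumptions:

```python
import hashlib
import io
import zipfile


def file_hashes(files: dict) -> dict:
    """Map each relative path to the SHA-256 of its content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def change_set(previous: dict, current: dict):
    """Return (added_or_modified, removed) paths between two versions."""
    prev_h, cur_h = file_hashes(previous), file_hashes(current)
    changed = sorted(p for p, h in cur_h.items() if prev_h.get(p) != h)
    removed = sorted(p for p in prev_h if p not in cur_h)
    return changed, removed


def pack_changes(current: dict, changed: list) -> bytes:
    """Compress only the changed files into a single in-memory zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in changed:
            zf.writestr(path, current[path])
    return buf.getvalue()


# Two toy versions of a dataset, as {relative path: content}
v1 = {"a.txt": b"hello", "b.txt": b"world"}
v2 = {"a.txt": b"hello", "b.txt": b"world!", "c.txt": b"new"}
changed, removed = change_set(v1, v2)
archive = pack_changes(v2, changed)
archived_names = zipfile.ZipFile(io.BytesIO(archive)).namelist()
```

In this toy example only b.txt and c.txt end up in the archive, since a.txt is unchanged from the previous version.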
Hi @<1532532498972545024:profile|LittleReindeer37>
This is truly a great discussion to have. Personally I think the main difference is that software development is a somewhat linear process, and git captures it very well. But ML is a much wider, nonlinear process, which to me means that trying to force the same workflow into a Dev tree will end up failing. The way ClearML thinks about it (and I think the analogy to source control is correct) is probably closer to how you think about proj...
os.environ['CLEARML_PROC_MASTER_ID'] = ''
Nice catch! (I'm assuming you also called Task.init somewhere before, otherwise I do not think this was necessary)
I think I solved it by deleting the project and running the base_task one time before the hyperparameter optimization
So is it working now? Is everything there?
On my to do list, but will have to wait for later this week (feel free to ping on this thread to remind me).
Regarding the issue at hand, let me check the requirements it is using.
ScantChimpanzee51 what's the use case for the full path without specific artifact?
PompousParrot44 That should be very easy to do, basically a service mode code that clones a base task and puts it into a queue:
This should more or less do what you need :)
` from trains import Task
task = Task.init('devops', 'daily train', task_type='controller')
# stop the local execution of this code, and put it into the service queue, so we have a remote machine running it.
task.execute_remotely('services')
while True:
    a_task = Task.clone(source_task='aaabb111')
    Task.enqueu...
Hmm good point, it should probably return the clearml python version. Is this what you mean?
When I passed specific arguments (for example --steps) it ignored them...
script.py test blah1 blah2 blah3 42
Is this how it is intended to be used ?
Oh, you can achieve exactly the same with plotly and the REST API / python interface.
Basically pull data from tasks, create a visualization, and log it on one of the Tasks or on a new one
SubstantialElk6
Hmm do you have torch in the "installed packages" section of the Task ?
(This is what the agent uses to set up the environment inside the docker, running as a pod)
Is there a way I could move the JWT authentication (not authorization) logic into an API Gateway or Load Balancer?
Hmm in theory, but not in practice 😞
if ClearML is following OAuth 2.0, t
This is for the SSO part, not for the API, API is only using JWT for verification, the login process itself is with external SSO (OAuth 2.0). But the open-source version does not support SSO 😞
Why are you trying to add another ELB with JWT verification on it ? ...
Hi WhimsicalLion91
You can always explicitly send a value:
` from trains import Logger
Logger.current_logger().report_scalar("title", "series", iteration=0, value=1337) `
A full example can be found here:
https://github.com/allegroai/trains/blob/master/examples/reporting/scalar_reporting.py
Hmm, I think "it" misses the fact that callbacks are not a package.
Any chance you can post the code here? (or DM me)