I’ve added multi-node support to my code, and I found our lab only seems to share user files, because I installed Trains on one node but it doesn’t appear on the others
I’m just curious how the Trains setup on different nodes communicates about the task queue
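For context: in Trains/ClearML the queue lives on the server, and agents on each node poll it over the REST API rather than talking to each other directly. A toy sketch of that pull-based pattern (an in-process stand-in, not the real agent code):

```python
import queue
import threading

# Toy model: one central queue (the "server"), several workers (the
# "agents" on different nodes) pulling tasks independently. In the real
# system the queue is on the trains-server and agents poll it over HTTP.
task_queue = queue.Queue()
results = []

def agent(name):
    while True:
        try:
            task = task_queue.get(timeout=0.2)
        except queue.Empty:
            return  # queue drained, agent goes idle
        results.append((name, task))
        task_queue.task_done()

for t in ["exp-1", "exp-2", "exp-3", "exp-4"]:
    task_queue.put(t)

workers = [threading.Thread(target=agent, args=(f"node-{i}",)) for i in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Each "node" pulls whatever is next, so no node-to-node coordination is needed.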
We all use conda, so I guess there’s no need for Docker
Yeah, I’m done with the test; now I can run it as you said
I see. Now we’re trying to let the agents pop the experiments separately and see if they can communicate with each other, right?
I’m not sure if I’m using the command correctly
But the thing is that I can only log everything from the master node
It’s shared, but only the user files, i.e. everything under the ~/ directory
Never done this before; let me do a quick search
But the solution in that answer doesn’t help, because when I do a reverse tunnel with -R, the server can’t be brought up
I don’t think this is related to PyTorch, because the same problem shows up with mp.spawn
I tried to run trains-compose without -d to see the log:
trains-agent-services | trains_agent: ERROR: Connection Error: it seems api_server is misconfigured. Is this the TRAINS API server http://apiserver:8008 ?
trains-agent-services | http://192.5.53.86:8081 http://192.5.53.86:8080 http://apiserver:8008
I didn’t assign anything to TRAINS_HOST_IP; I’m not sure if the apiserver:8008 caused the problem
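A hedged sketch, assuming a standard trains-server docker-compose deployment where the agent-services container resolves the API address from TRAINS_HOST_IP: export the host’s real IP before bringing the stack up (the IP and compose file name here are placeholders, not taken from your setup):

```shell
# Set the externally reachable IP of the machine running trains-server,
# so trains-agent-services does not fall back to internal names like
# apiserver:8008 that other hosts cannot resolve.
export TRAINS_HOST_IP=192.5.53.86

# Then (re)start the stack; -d detaches, omit it to watch the logs.
docker-compose -f docker-compose.yml up -d
```

This is only a sketch of the usual pattern; check your server’s own compose file for the exact variable it reads.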
It only works when we set CLEARML_CONFIG_FILE before running the script
Yes, I think Trains might wrap the torch.load function, but the thing is that I need to load parts of the dataset using torch.load, so this error shows up many times during training. I found I can use this line:

task = Task.init(project_name="Alfred", task_name="trains_plot", auto_connect_frameworks={'pytorch': False})

but does it mean I cannot monitor the torch.load function any more?
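To make the trade-off concrete, here is a pure-Python sketch (no clearml or torch needed) of how framework auto-connect typically works: the integration wraps a library function, and disabling it restores the original, so you lose the monitoring but also the wrapper’s side effects. The names here are illustrative, not ClearML internals:

```python
import types

# Stand-in for the torch module; torch.load just returns a marker string.
torch = types.SimpleNamespace()
torch.load = lambda path: f"tensor from {path}"

_original_load = torch.load
calls = []  # what the tracker would have logged

def _patched_load(path):
    calls.append(path)              # monitoring hook
    return _original_load(path)

def connect_pytorch(enabled):
    # Like auto_connect_frameworks={'pytorch': enabled}: swap the wrapper
    # in or out without changing the function's return value.
    torch.load = _patched_load if enabled else _original_load

connect_pytorch(True)
torch.load("part1.pt")              # recorded by the hook
connect_pytorch(False)
torch.load("part2.pt")              # not recorded: no monitoring, no wrapper errors
```

So with the 'pytorch' hook off, torch.load calls behave exactly as vanilla PyTorch, and nothing about them is reported.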
I think so, let me give it a try. By the way, I just found the server API but I’m not sure how to use it. For example, for /debug.ping, should I POST the request to “ http://localhost:8080/debug/ping ” or “ http://localhost:8080/debug.ping ”?
Thanks, I’ll give it a try
Not for now; I think it can only run on multiple GPUs on one node
I tried to set the environment variable right before importing clearml, but it doesn’t work as expected:

import os
import socket
from pathlib import Path

os.environ['CLEARML_CONFIG_FILE'] = str(Path.home() / f"clearml-{socket.getfqdn()}.conf")

from clearml import Task
Task.init(project_name="Alfred", task_name="finalized", auto_connect_frameworks={'pytorch': False})
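A hedged workaround sketch: set the variable in the launching shell instead, so it is already in the process environment when Python starts and clearml is imported (the clearml-&lt;fqdn&gt;.conf naming mirrors the snippet above; the python3 one-liner is just a placeholder for the training script):

```shell
# The variable is part of the environment before the interpreter starts,
# so anything that reads it at import time sees the right value.
CLEARML_CONFIG_FILE="$HOME/clearml-$(hostname -f).conf" \
python3 -c 'import os; print(os.environ["CLEARML_CONFIG_FILE"])'
```

Setting it this way sidesteps any question of whether the library reads the variable before or after your os.environ assignment runs.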
I’m trying to install it on my lab server, but the same problem happens. When I try to create credentials it reports an error, but this time it gives more info:
Error 301 : Invalid user id: id=f46262bde88b4928997351a657901d8b, company=d1bd92a3b039400cbafc60a7a5b1e52b
Before I renamed it, I could log the experiment successfully; I basically added Task.init to the Python script and then just ran that script
I found the server API here: https://allegro.ai/clearml/docs/rst/references/clearml_api_ref , but I’m not sure how to use it. For example, for /debug.ping, should I POST the request to “ http://localhost:8080/debug/ping ” or “ http://localhost:8080/debug.ping ”?
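For what it’s worth, the server API uses “service.action” paths, so the dotted form /debug.ping is the endpoint shape, and API calls go to the API server port (8008 by default), not the web UI on 8080. A small sketch of building such a URL (host and port are assumptions for a default local install):

```python
# ClearML/Trains API endpoints are "<service>.<action>", e.g. debug.ping,
# served by the apiserver (default port 8008), not the web UI (8080).
base = "http://localhost:8008"
service, action = "debug", "ping"
url = f"{base}/{service}.{action}"
print(url)  # http://localhost:8008/debug.ping

# A request would then be, e.g., requests.post(url, json={}) with your
# credentials attached (not run here).
```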
Then access port 8008 through the tunnel
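A minimal sketch of that tunnel, assuming you can SSH into the node running the server as user@lab-server (both names are placeholders): forward the relevant ports to your local machine with -L (local forwarding, which avoids the -R issue mentioned earlier), then the API server is reachable at http://localhost:8008.

```shell
# -N: no remote command, just forwarding.
# 8008 = API server, 8080 = web UI, 8081 = file server.
ssh -N \
    -L 8008:localhost:8008 \
    -L 8080:localhost:8080 \
    -L 8081:localhost:8081 \
    user@lab-server
```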