I did change the
instead of 8080?
So this is the issue
ReassuredTiger98
(for some reason it kind of jumps over PyTorch, but then installs torchvision?!)
Could you run the latest version with --debug?
We just added it, but you will have to install from git:
pip3 install git+
Then run with --debug:
clearml-agent --debug daemon ...
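For reference, the full daemon command might look something like this (the queue name is a placeholder):
```
# run the agent in the foreground with verbose debug output
clearml-agent --debug daemon --queue default --foreground
```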
You can make reports on experiments with interactive graphs
Yes, I can totally see how this is a selling point. The closest is the Project Overview (full markdown capabilities, with the ability to embed links to specific experiments). You can also add a "leader metric", so you can track the project performance/progress over time.
I have to admit that creating a better reporting tool is always pushed down in priority as I think this is a good selling point to management but the actual ...
Hi ProudMosquito87
My apologies, there is still no concrete ETA ...
That said I think a good toy example would really help accelerate this process.
How about opening a PR with a nice hydra example, then we can start discussing implementation details based on the toy example?
I see, by default it will look for requirements.txt in the root of the repo (the actual repo).
That said, in code you can specify the requirements.txt:
Task.force_requirements_env_freeze(requirements_file='repo/project-a/requirements.txt')
task = Task.init(...)
Notice, you need to call it prior to the Task.init call.
Click on the "k8s_schedule" queue, then on the right-hand side you should see your Task. Click on it to open the Task page. There, click on the "Info" tab and look for "STATUS MESSAGE" and "STATUS REASON". What do you have there?
WackyRabbit7 I guess we are discussing this one on a different thread 🙂 but yes, it should totally work, that's the idea
Hi
The Squash operation copies all the data and is no longer linked to previous commits?
Yes. Basically the idea is that if you have a data version that relies on many parents that need to be merged, the squash will create a merged copy and push it all as a single version, and then, yes, the parent versions are no longer needed.
I thought this operation is like git squash but it seems to me
yeah... we did not want to actually delete the parents because, unlike git, the operation is done ...
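For context, a squash call is roughly this (the dataset name and IDs below are placeholders):
```
from clearml import Dataset

# merge several parent versions into a single, standalone version
merged = Dataset.squash(
    dataset_name="merged-data",
    dataset_ids=["<parent_id_1>", "<parent_id_2>"],
)
```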
You can always specify different clearml.conf files with --config-file 🙂
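For example (the path and queue name are placeholders):
```
# point the agent at an alternative configuration file
clearml-agent --config-file /path/to/other/clearml.conf daemon --queue default
```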
Feels like we've been over this
LOL, I think I can't wrap my head around the use case 🙂
When running locally, this is "out of the box", as we can init and close before and after each model.
I finally got it!
Task.init should be dubbed "init Main task"; automagic kicks in only when it is the only one existing. Your remote execution is "linear", Task after Task, in theory a good candidate for a pipeline.
Basically option (2), the main task is being "replaced" (which loca...
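A minimal sketch of the init/close-per-model pattern mentioned above (the names and the train_model stub are hypothetical):
```
from clearml import Task

def train_model(name):
    # hypothetical stand-in for your actual training code
    pass

for model_name in ["model-a", "model-b"]:  # placeholder model list
    task = Task.init(project_name="examples", task_name=model_name)
    train_model(model_name)
    task.close()  # close the current task so the next Task.init starts a fresh one
```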
LovelyHamster1 Now I see... Interesting credentials ability. Specifically, all the S3 access in trains is derived from the ~/clearml.conf credentials section:
https://github.com/allegroai/clearml/blob/ebc0733357ac9ead044d0ed32d41447763f5797e/docs/clearml.conf#L73
( or the AWS S3 environment variables )
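For reference, that block in clearml.conf looks roughly like this (the values are placeholders):
```
sdk {
    aws {
        s3 {
            # default credentials used for all S3 access
            key: "<access_key_id>"
            secret: "<secret_access_key>"
            region: ""
        }
    }
}
```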
I'm not sure how this AWS feature works; I suspect it is changing the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY variables on the EC2 instance. If this is the case, it should work out of...
Hi @<1570583227918192640:profile|FloppySwallow46>
Not sure I follow, could you explain?
so I assume clearml moves them from one queue to the other?
Correct. When it creates the k8s job and launches it on the cluster it moves it into the queue.
Can you see it on your k8s cluster (meaning the job/pod)?
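For example, something along these lines should show it (the namespace is an assumption, adjust to your deployment):
```
# list the pods/jobs the agent created; "clearml" is a placeholder namespace
kubectl get pods -n clearml
kubectl get jobs -n clearml
```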
That makes sense to me, what do you think about the following:
```
from clearml import PipelineDecorator

class AbstractPipeline(object):
    def __init__(self):
        pass

    @PipelineDecorator.pipeline(...)
    def run(self, run_arg):
        data = self.step1(run_arg)
        final_model = self.step2(data)
        self.upload_model(final_model)

    @PipelineDecorator.component(...)
    def step1(self, arg_a):
        # do something
        return value

    @PipelineDecorator.component(...)
    def step2(self, arg_b):
        # do ...
```
Hi ExuberantBat52
I do not think you can... I would use AWS Secrets Manager to push the entire user list config file, wdyt?
SweetGiraffe8 Works when I'm using plotly...
Can you please copy-paste the code with the plotly part? It's probably something I'm missing.
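For comparison, reporting a plotly figure explicitly would look something like this (the names and figure are placeholders):
```
import plotly.graph_objects as go
from clearml import Task

task = Task.init(project_name="examples", task_name="plotly demo")
fig = go.Figure(data=go.Scatter(y=[1, 3, 2]))  # placeholder figure
# attach the interactive figure to the task's plots
task.get_logger().report_plotly(title="demo", series="scatter", iteration=0, figure=fig)
```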
Hi @<1533619725983027200:profile|BattyHedgehong22>
Can you elaborate? What do you mean by params file?
Is this something like:
Task.current_task().connect_configuration('my_conf.json', name="my conf file")
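i.e. a minimal sketch (the project/task names are placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")
# attach the local json file as a configuration object on the task
task.connect_configuration("my_conf.json", name="my conf file")
```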
So I'd create the queue in the UI, then update the helm yaml as above, and install? How would I add a 3rd queue?
Same process?!
Also I'd like to create the queues programmatically, is that possible?
Yes, you can. You can also pass an argument for the agent to create the queue if it does not already exist: just add --create-queue to the agent execution command line.
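For example (the queue name is a placeholder):
```
# the agent will create "my_new_queue" if it does not already exist
clearml-agent daemon --queue my_new_queue --create-queue
```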
EnormousWorm79 you mean to get the DAG graph of the Dataset (like you see in the plots section)?
My apologies, you are correct, 1.8.1rc0 🙂
If i point directly to the data.yaml the training starts without any problem
What do you mean? How do you know where the extracted file is?
Basically:
data_path = Dataset.get(...).get_local_copy()
Then you should be able to open your file with:
open(data_path + "/data.yaml", "rt")
Does that work?
Hi @<1686547375457308672:profile|VastLobster56>
Where are you getting stuck? Are you getting any errors?
CheerfulGorilla72
yes, IP-based access,
hmm, so this is the main downside of using an IP-based server: the links (debug images, models, artifacts) store the full URL (e.g. http://IP:8081/...). This means that if you switch IPs they will no longer work. Any chance to fix the new server to the old IP?
(the other option is somehow edit the DB with the links, I guess doable but quite risky)
thought the agent created a new conda env and installed all packages
It does, but I was asking what is written on the original Task (the one created when you executed the code on your laptop, not when the agent was executing it). When the agent is executing the Task, it writes back all the packages of the entire venv it created; when the Task is run manually, it will list only the packages you import directly (i.e. from package or import package, it actually analyses the code).
My point...
yes
argument saying always create from code
can be helpful
@<1523701523954012160:profile|ShallowCormorant89> any chance you can open a GitHub issue on that, just so we do not forget?
if we can edit the configuration objects of a pipeline, that can be beneficial too. which we're unable to do from UI
Actually, you already can: after you clone the pipeline, press on Details, then go to the Configuration tab and edit the pipeline object. The format is HOCON (...
It has to be alive so all the "child nodes" could report to it....