Hi SourOx12
I think that you do not actually need this one: step = step - cfg.start_epoch + 1
you can just do: step += 1
ClearML will take care of the offset itself
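To illustrate the point about the offset, here is a minimal sketch in plain Python. It is only a model of the idea, not ClearML's code: reported_iteration is a hypothetical helper, and the offset is assumed to equal the last iteration of the previous run (the value ClearML actually stores may differ by one).

```python
def reported_iteration(local_step, initial_iteration=0):
    """Model of the x-axis value: local counter plus a stored offset (assumption)."""
    return initial_iteration + local_step


# First run: the training loop simply counts 1, 2, 3.
first_run = [reported_iteration(step) for step in range(1, 4)]

# Resumed run: the loop counts from 1 again; the offset (assumed here to be
# the last iteration of the previous run) shifts the reported values.
offset = first_run[-1]
resumed_run = [reported_iteration(step, offset) for step in range(1, 4)]

print(first_run, resumed_run)
```

With the offset applied on the reporting side, the training loop does not need to subtract cfg.start_epoch itself.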
It looks somewhat familiar ... 😞
SuccessfulKoala55 any idea?
DilapidatedDucks58 trains-agent adds the artifactory URL as --extra-index-url. Are you sure you are getting the correct torch version in the container? The torch html is not an artifactory html, it is a list of links. I just want to make sure you are getting the correct version, because otherwise it can default to the CPU version, which we don't want 🙂 Anyhow, you can use the direct link in the "installed packages" section and just put there: https://download.pytorch.org/whl/nightly/cu101...
Quick update: Nexus supports direct http upload, which means that, as CostlyOstrich36 mentioned, just pointing to the Nexus http upload endpoint would work: output_uri="http://<nexus>:<port>/repository/something/"
See docs:
https://support.sonatype.com/hc/en-us/articles/115006744008-How-can-I-programmatically-upload-files-into-Nexus-3-
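For reference, a direct http upload of this kind is an HTTP PUT to a path under the repository URL. The sketch below only builds such a request with the standard library and does not send it. The host nexus.example.com:8081, the file path, the credentials, and the helper name build_nexus_upload are all placeholders I made up for illustration.

```python
import base64
import urllib.request


def build_nexus_upload(base_url, remote_path, payload, user, password):
    """Build (but do not send) an HTTP PUT request for a direct upload."""
    url = base_url.rstrip("/") + "/" + remote_path.lstrip("/")
    request = urllib.request.Request(url, data=payload, method="PUT")
    # Basic auth header; the credentials are placeholders.
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    return request


request = build_nexus_upload(
    "http://nexus.example.com:8081/repository/something/",  # hypothetical endpoint
    "models/model.pkl",                                      # hypothetical path
    b"model bytes",
    "user",
    "secret",
)
# urllib.request.urlopen(request) would perform the actual upload.
print(request.get_method(), request.full_url)
```

This is just to check the endpoint by hand; with output_uri set as above, ClearML does the upload for you.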
AstonishingRabbit13
https://github.com/googleapis/google-cloud-python/issues/4941#issuecomment-369472576
check the openssl and the date, this seems like SSL low level error (even before authentication)
Not intentional! When I launched the AMI it was running an older version
I think this is exactly the reason they decided to change the location 🙂 so you will have to manually upgrade, reasoning is we changed directory names (maybe a few more things)
Yes: shut down the current docker-compose, curl the new docker-compose file, rename the folder, and spin it up again. Full instructions here:
https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_aws_ec2_ami.html#upgrading
Hi SteadyFox10, this one will get all the last metric scalars: train_logger.get_last_scalar_metrics()
It's in the docker image. Doesn't the git clone command run in the container?
Then this should have worked.
Did you pass in the configuration: force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/e93384b99bdfd72a54cf2b68b3991b145b504b79/docs/clearml.conf#L25
Ok, so it doesn't follow the exact same rules as Task.init?
Correct
I was afraid all the logs and outputs of a hyperparameter optimization task would be deleted just because no artifacts were created.
Should not happen 🙂
Hi CheerfulGorilla72
Please tell me what RAM metric is tracked by ClearML?
Free RAM is the entire machine free RAM
Yeah htop shows odd numbers as it doesn't "count" allocated buffers
specifically you can see the code here:
None
Three options:
1. In your code: Task.init(..., output_uri='s3://.../')
2. Configure a default output_uri to be used by all tasks: https://github.com/allegroai/clearml/blob/64042f6c4fdaaf15b6c5f816f2fbf50f89c313e2/docs/clearml.conf#L156
3. In the UI after you clone a Task under Execution tab, "output" "destination"
In all cases output_uri can be:
/mnt/share/folder (if you have a shared folder between all machines)
http://trains-server:8081/
gs://bucket
azure://bucket/
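As a quick way to sanity-check a destination string before using it, here is a small sketch that labels the kinds of output_uri listed above by their URL scheme. describe_output_uri and its labels are hypothetical and not part of ClearML; the example values are the ones from this thread.

```python
from urllib.parse import urlparse

# Labels for the destination kinds mentioned in this thread (illustrative only).
KNOWN_SCHEMES = {
    "": "shared folder / local path",
    "http": "files server over http",
    "https": "files server over https",
    "s3": "S3 bucket",
    "gs": "Google Storage bucket",
    "azure": "Azure blob storage",
}


def describe_output_uri(output_uri):
    """Return a human-readable label for an output_uri string."""
    scheme = urlparse(output_uri).scheme
    return KNOWN_SCHEMES.get(scheme, "unknown scheme: {}".format(scheme))


for uri in ["/mnt/share/folder", "http://trains-server:8081/", "gs://bucket", "azure://bucket/"]:
    print(uri, "->", describe_output_uri(uri))
```

Whichever form you pick, the same string goes into Task.init(..., output_uri=...), the default in clearml.conf, or the UI field.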
docstring ?
Usually the preferred way is StorageManager
https://clear.ml/docs/latest/docs/references/sdk/storage
https://clear.ml/docs/latest/docs/integrations/storage
Thanks CynicalBee90, I appreciate the discussion! Since I'm assuming you will actually amend the misrepresentation in your table, let me follow up here.
1.
SPSS license may be a significant consideration for some, and so we thought it was important to point this out clearly.
SPSS is fully open-source compliant unless you have the intention of selling it as a service. I hardly think this is any user's consideration, just like anyone would be using mongodb or elastic search without think...
PompousBeetle71 notice that starting with this version, when you set model tags they will be stored as user tags, which you can change and edit in the UI. So if you still need the system tags, you have to access them directly.
Thanks for checking NastyFox63
I double checked with both front/backend , there should not be any limit...
Could you maybe provide a toy demo to reproduce the issue ?
FreshKangaroo33 you can:
from time import time
from datetime import datetime
Task.query_tasks(..., task_filter=dict(started=['<{}'.format(datetime.utcfromtimestamp(time())), ]))
I think this should work
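A slightly more explicit sketch of building that filter, in case it helps. started_before and utc_now are hypothetical helper names. It keeps the same string format as the one-liner, but uses datetime.now(timezone.utc), since utcfromtimestamp is deprecated as of Python 3.12. Whether the server accepts this exact timestamp format is still an assumption on my side.

```python
from datetime import datetime, timezone


def utc_now():
    """Current UTC time as a naive datetime (same string form as utcfromtimestamp)."""
    return datetime.now(timezone.utc).replace(tzinfo=None)


def started_before(moment):
    """Build a task_filter dict selecting tasks started before `moment`."""
    return dict(started=["<{}".format(moment)])


task_filter = started_before(utc_now())
example = started_before(datetime(2021, 1, 2, 3, 4, 5))
print(example)
```

The resulting dict is what you would pass as Task.query_tasks(..., task_filter=task_filter).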
ngrok to connect to the remote server at the office?
That makes sense, I guess this is the equivalent of using a VPN, from that point onward clearml-session can directly access the remote machine, right?
No worries, I'll see what I can do 🙂
ImmensePenguin78 it might be... Let me check, worst case sync after the weekend 🙂
(pypi does contain 1.2.0rc4 and we are finalizing tests so that we can release a stable 1.2.0)
HealthyStarfish45 you mean like replace the debug image viewer with custom widget ?
For the images themselves, you can get their URLs, then embed them in your static html.
You could also have your html talk directly with the server REST API.
What did you have in mind?
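For the static html route, something along these lines might be enough. It is only a sketch: build_gallery is a hypothetical helper, and the image URLs are made-up placeholders standing in for whatever URLs you fetch for your debug images.

```python
from html import escape


def build_gallery(image_urls, title="Debug images"):
    """Return a static HTML page that embeds each URL as an <img> tag."""
    items = "\n".join(
        '  <img src="{}" alt="debug image {}">'.format(escape(url, quote=True), index)
        for index, url in enumerate(image_urls)
    )
    return (
        "<!DOCTYPE html>\n<html>\n<head><title>{}</title></head>\n"
        "<body>\n{}\n</body>\n</html>\n"
    ).format(escape(title), items)


page = build_gallery([
    "http://files.example.com/a.png",          # placeholder URL
    "http://files.example.com/b.png?x=1&y=2",  # placeholder URL
])
print(page)
```

The URLs are escaped so that query strings with & do not break the markup.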
I think the clearml-session CLI is missing the ability to add a custom port to the external address, does that make sense?
I think RoughTiger69 was discussing this exact scenario
https://clearml.slack.com/archives/CTK20V944/p1629885416175500?thread_ts=1629881415.172600&cid=CTK20V944
wdyt?
https://github.com/allegroai/clearml/issues/199
Seems already supported for a while now ...
Does adding external files not upload them to the dataset output_uri?
CooperativeOtter46 If you are adding the links with add_external_files, these files are not re-uploaded
Hi CurvedDolphin95
I would first check the free space on the instance (it might be that git is reporting an inaccurate error, and it's free space, not permissions, that is causing the clone to fail).
I would also check your GitHub account. Notice that they now only support user/api-key (and not user/pass), which means you need to create an api-key and add it as your password in the clearml.conf.
Any chance that for some reason some of the Tasks are running from a different user? Or not using a docker?
SmarmyDolphin68
BTW: there is no automatic reporting when you have task = Task.get_task(task_id='your_task_id')
It's only active when you have one "main" task.
You can also check the continue_last_task argument in Task.init, it might be a good fit for your scenario
https://allegro.ai/docs/task.html#trains.task.Task.init
that's the entire repo link ? not something like https://github.com/ ... ?