What sort of integration is possible with ClearML and SageMaker?

What sort of integration is possible with ClearML and SageMaker? On the page describing ClearML Remote it says:

Create a remote development environment (e.g. AWS SageMaker, GCP CoLab, etc.) on any on-prem machine or any cloud.

But the only mention of SageMaker I see in the docs is the release notes for 0.13 saying "Add support for SageMaker".

I have SageMaker Studio up and running with access to my ClearML server and it's successfully able to log plots and scalars from experiments, but in terms of code it just logs the code used to launch the kernel:

"""Entry point for launching an IPython kernel.
This is separate from the ipykernel package so we can avoid doing imports until
after removing the cwd from sys.path.
"""
import sys

if __name__ == '__main__':
    # Remove the CWD from sys.path while we load stuff.
    # This is added back by InteractiveShellApp.init_path()
    if sys.path[0] == '':
        del sys.path[0]
    from ipykernel import kernelapp as app
    app.launch_new_instance()

Is it possible to capture more than that while using SageMaker?
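In the meantime the only workaround I can think of is attaching the notebook manually, something like the sketch below (the project/task names and the notebook path are placeholders, and having to hard-code the path is exactly what I'd like to avoid):

from clearml import Task

task = Task.init(project_name="sagemaker-tests", task_name="studio-notebook")

# Manual fallback: upload the notebook file itself as an artifact,
# since the automatic code capture only sees the kernel launcher.
task.upload_artifact(name="notebook", artifact_object="my_notebook.ipynb")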

  
  
Posted one year ago

Answers 77


nice! Just tested it on my end as well, looks like it works!

  
  
Posted one year ago

Hi @LittleReindeer37 @AgitatedDove14
I got the session with a bit of "hacking".
See this script:

import boto3, requests, json
from urllib.parse import urlparse

def get_notebook_data():
    # Studio writes the app metadata (domain id, user profile, ...) to this file
    log_path = "/opt/ml/metadata/resource-metadata.json"
    with open(log_path, "r") as logs:
        _logs = json.load(logs)
    return _logs

notebook_data = get_notebook_data()
client = boto3.client("sagemaker")

# Create a presigned Studio URL for this domain/user; opening it authenticates us
response = client.create_presigned_domain_url(
    DomainId=notebook_data["DomainId"],
    UserProfileName=notebook_data["UserProfileName"]
)
authorized_url = response["AuthorizedUrl"]
authorized_url_parsed = urlparse(authorized_url)
unauthorized_url = authorized_url_parsed.scheme + "://" + authorized_url_parsed.netloc

with requests.Session() as s:
    # First request picks up the auth cookies, second one queries the Jupyter sessions API
    s.get(authorized_url)
    print(s.get(unauthorized_url + "/jupyter/default/api/sessions").content)

Basically, we can get the session directly from AWS, but we need to be authenticated.
One way I found is to create a presigned URL through boto3, using the domain id and profile name from the resource-metadata file that is found on the machine.
Then use that to get the session...
There are probably other (safer) ways to do this, but it's a good start. We know it's possible.
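If it helps, here's a sketch of how the sessions payload could be used to resolve the notebook path (authorized_url / unauthorized_url come from the script above; the "path" / "notebook" keys follow the Jupyter Server sessions API and may vary between versions):

import requests

with requests.Session() as s:
    s.get(authorized_url)  # picks up the auth cookies from the presigned URL
    sessions = s.get(unauthorized_url + "/jupyter/default/api/sessions").json()

for session in sessions:
    # Newer jupyter_server puts the path at the top level, older notebook servers nest it
    nb_path = session.get("path") or session.get("notebook", {}).get("path", "")
    if nb_path.endswith(".ipynb"):
        print("notebook path:", nb_path)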

  
  
Posted one year ago

the server_info is

[{'base_url': '/jupyter/default/',
  'hostname': '0.0.0.0',
  'password': False,
  'pid': 9,
  'port': 8888,
  'root_dir': '/home/sagemaker-user',
  'secure': False,
  'sock': '',
  'token': '',
  'url': '',
  'version': '1.23.2'}]
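For reference, that output comes straight from the jupyter_server.serverapp.list_running_servers() call discussed below:

from jupyter_server import serverapp

# One dict per running Jupyter server; on Studio the 'url' field comes back unusable,
# which is what the rest of this thread is about.
server_info_list = list(serverapp.list_running_servers())
print(server_info_list)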
  
  
Posted one year ago

Hmm, and you are getting an empty list for this one:

server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"
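i.e. roughly this manual check (the hard-coded values come from the server_info above; appending api/sessions is my assumption about the endpoint being queried):

import requests

# Values copied from the server_info printed above
server_info = {'hostname': '0.0.0.0', 'port': 8888}
server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"

# Does the rebuilt URL actually return any sessions, or an empty list?
resp = requests.get(server_info['url'] + "api/sessions", timeout=2)
print(resp.status_code, resp.json() if resp.ok else resp.text[:200])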
  
  
Posted one year ago

Yes, I'm running a notebook in Studio. Where should it be captured?

  
  
Posted one year ago

yep

  
  
Posted one year ago

right now I can't figure out how to get the session in order to get the notebook path

you mean the code that throws the "HTTPConnectionPool" error?

  
  
Posted one year ago

This is very odd ... let me check something

  
  
Posted one year ago

Curious whether it impacts anything besides SageMaker. I'm thinking it's generically a kernel gateway issue, but I'm not sure whether other platforms are using that yet.

  
  
Posted one year ago

but the call to jupyter_server.serverapp.list_running_servers() does return the server

  
  
Posted one year ago

the problem is here:

  
  
Posted one year ago

if I add the base_url it's not found

  
  
Posted one year ago

as best I can tell it'll only have one .ipynb in $HOME with this setup, which may work...
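A minimal sketch of that fallback, assuming exactly one notebook sits directly in $HOME:

from pathlib import Path

# With this setup the Studio home dir only ever holds the one notebook,
# so just pick it up by globbing.
notebooks = list(Path.home().glob("*.ipynb"))
if len(notebooks) == 1:
    print("assuming notebook:", notebooks[0])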

  
  
Posted one year ago

Try to add here:

server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"
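In context, the idea is roughly this (a sketch of the surrounding loop, not the actual ClearML code):

from jupyter_server import serverapp

# Rebuild the URL from hostname + port for every server found, then query it,
# instead of trusting the 'url' field Studio reports.
for server_info in serverapp.list_running_servers():
    server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"
    print(server_info['url'])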
  
  
Posted one year ago

so the notebooks list ends up empty

  
  
Posted one year ago

if I change it to 0.0.0.0 it works
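For anyone following along, a quick way to compare the variants by hand (hostname, port, and base_url taken from the server_info above):

import requests

candidates = [
    "http://0.0.0.0:8888/api/sessions",                  # hostname + port only
    "http://0.0.0.0:8888/jupyter/default/api/sessions",  # with the base_url prepended
]
for url in candidates:
    try:
        resp = requests.get(url, timeout=2)
        print(url, "->", resp.status_code)
    except requests.exceptions.RequestException as exc:
        print(url, "->", exc)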

  
  
Posted one year ago

so the notebook path is empty

  
  
Posted one year ago