but maybe that doesn't matter, actually - it might be one session per host I guess
if I instead change the request url to f"http://{server_info['hostname']}:{server_info['port']}/api/sessions"
then it gets a 200 response... however , the response is an empty list
it does return kernels, just not sessions
This is strange, let me see if we can get around it, because I'm sure it worked 🙂
What happens when you call:
from clearml.backend_interface.task.repo import ScriptInfo
print(ScriptInfo._ScriptInfo__legacy_jupyter_notebook_server_json_parsing(None))
and cat /var/log/studio/kernel_gateway.log | grep ipynb
comes up empty
and the only calls to "uri": "/api/sessions"
are the ones I made during testing - sagemaker doesn't seem to ever call that itself
environ{'PYTHONNOUSERSITE': '0',
'HOSTNAME': 'gfp-science-ml-t3-medium-d579233e8c4b53bc5ad626f2b385',
'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/_sagemaker-instance-credentials/xxx',
'JUPYTER_PATH': '/usr/share/jupyter/',
'SAGEMAKER_LOG_FILE': '/var/log/studio/kernel_gateway.log',
'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tmp/miniconda3/condabin:/tmp/anaconda3/condabin:/tmp/miniconda2/condabin:/tmp/anaconda2/condabin',
'REGION_NAME': 'us-east-1',
'AWS_INTERNAL_IMAGE_OWNER': 'Custom',
'AWS_DEFAULT_REGION': 'us-east-1',
'PWD': '/home/sagemaker-user',
'AWS_REGION': 'us-east-1',
'SHLVL': '1',
'HOME': '/home/sagemaker-user',
'AWS_SAGEMAKER_PYTHONNOUSERSITE': '0',
'AWS_ACCOUNT_ID': 'xxx',
'_': '/opt/.sagemakerinternal/conda/bin/jupyter-kernelgateway',
'LC_CTYPE': 'C.UTF-8',
'KERNEL_LAUNCH_TIMEOUT': '40',
'KERNEL_WORKING_PATH': '',
'KERNEL_GATEWAY': '1',
'JPY_PARENT_PID': '9',
'PYDEVD_USE_FRAME_EVAL': 'NO',
'TERM': 'xterm-color',
'CLICOLOR': '1',
'FORCE_COLOR': '1',
'CLICOLOR_FORCE': '1',
'PAGER': 'cat',
'GIT_PAGER': 'cat',
'MPLBACKEND': '
_inline'}
right now I can't figure out how to get the session in order to get the notebook path
nice! Just tested it on my end as well, looks like it works!
Hi @<1532532498972545024:profile|LittleReindeer37> @<1523701205467926528:profile|AgitatedDove14>
I got the session with a bit of "hacking".
See this script:
import boto3, requests, json
from urllib.parse import urlparse
def get_notebook_data():
log_path = "/opt/ml/metadata/resource-metadata.json"
with open(log_path, "r") as logs:
_logs = json.load(logs)
return _logs
notebook_data = get_notebook_data()
client = boto3.client("sagemaker")
response = client.create_presigned_domain_url(
DomainId=notebook_data["DomainId"],
UserProfileName=notebook_data["UserProfileName"]
)
authorized_url = response["AuthorizedUrl"]
authorized_url_parsed = urlparse(authorized_url)
unauthorized_url = authorized_url_parsed.scheme + "://" + authorized_url_parsed.netloc
with requests.Session() as s:
s.get(authorized_url)
print(s.get(unauthorized_url + "/jupyter/default/api/sessions").content)
Basically, we can get the session directly from AWS, but we need to be authenticated.
One way I found was to create a presigned url through boto3, by getting the domain id and profile name from a resoure-metadata file that is found on the machine None .
Then use that to get the session...
Maybe there are some other ways to do this (safer), but this is a good start. We know it's possible
the server_info
is
[{'base_url': '/jupyter/default/',
'hostname': '0.0.0.0',
'password': False,
'pid': 9,
'port': 8888,
'root_dir': '/home/sagemaker-user',
'secure': False,
'sock': '',
'token': '',
'url': '
',
'version': '1.23.2'}]
Hmm and you are getting empty list for thi one:
server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"
Yes, I'm running a notebook in Studio. Where should it be captured?
right now I can't figure out how to get the session in order to get the notebook path
you mean the code that fires "HTTPConnectionPool" ?
This is very odd ... let me check something
curious whether it impacts anything besides sagemaker. I'm thinking it's generically a kernel gateway issue, but I'm not sure if other platforms are using that yet
but the call to jupyter_server.serverapp.list_running_servers()
does return the server
as best I can tell it'll only have one .ipynb in $HOME
with this setup, which may work...
Try to add here:
None
server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"