In which config file do we need to add proxies? I am using the Squirro plugin dataloader.
Hi @adarsh
Proxies can be configured through the configuration files. Specifically, you can edit /etc/squirro/common.ini and add the proxy configuration as follows:
[proxy]
proxy = http://proxy.mycorp.com:8080
no_proxy = 127.0.0.1,localhost,127.0.0.1:81,localhost:81
This particular configuration will enable the indicated proxy for all requests, except any requests to the listed exceptions (localhost nodes in this case).
Note that all services need to be restarted after any change to this configuration. You can use the squirro_restart command to restart the Squirro services.
Options
Option | Description |
---|---|
proxy | Proxy used for HTTP and HTTPS requests. |
http_proxy | Proxy used for HTTP requests. Only used if proxy is not specified. |
https_proxy | Proxy used for HTTPS requests. Only used if proxy is not specified. |
no_proxy | Comma-separated list of hostname suffixes for which the proxy should not be consulted. This will usually contain data-centre domain names for which the proxy is either not needed or even a hindrance. |
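To make the precedence between these options concrete, here is a minimal Python sketch of the rules as the table describes them. It is only an illustration and not Squirro's actual implementation: the function names `resolve_proxies` and `bypasses_proxy` are made up for this example, and the plain suffix match is an assumption based on the wording of the no_proxy description.

```python
import configparser

# The example [proxy] section from the answer above.
SAMPLE = """
[proxy]
proxy = http://proxy.mycorp.com:8080
no_proxy = 127.0.0.1,localhost,127.0.0.1:81,localhost:81
"""


def resolve_proxies(section):
    """Return a scheme -> proxy URL mapping following the table's rules.

    Hypothetical helper: `proxy` wins for both schemes; `http_proxy` and
    `https_proxy` are only consulted when `proxy` is not specified.
    """
    common = section.get("proxy")
    if common:
        return {"http": common, "https": common}
    proxies = {}
    for scheme in ("http", "https"):
        value = section.get(f"{scheme}_proxy")
        if value:
            proxies[scheme] = value
    return proxies


def bypasses_proxy(host, no_proxy):
    """Return True if `host` ends with one of the comma-separated suffixes."""
    suffixes = [s.strip() for s in no_proxy.split(",") if s.strip()]
    return any(host.endswith(suffix) for suffix in suffixes)


parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
section = dict(parser["proxy"])

print(resolve_proxies(section))
print(bypasses_proxy("localhost", section["no_proxy"]))
```

With the sample section, both schemes resolve to the single `proxy` value, and requests to localhost skip the proxy.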
I have also encountered a similar issue @pneff.
I can confirm that the common.ini settings have been updated to reflect the above information. However, when trying to fetch the document preview from the dataloading screen, I seem to get the following error: ValueError: check_hostname requires server_hostname
The full stack trace can be found below:
MainThread datasourced[2060] 2022-03-08 07:15:14,742 INFO Start load from squirro_plugin
MainThread datasourced[2060] 2022-03-08 07:15:14,770 ERROR Exception: (None, ValueError('check_hostname requires server_hostname',))
Traceback (most recent call last):
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro_client/base.py", line 276, in _perform_authentication
r = session.post(url, data=data, timeout=self.timeout_secs)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/requests/sessions.py", line 590, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 696, in urlopen
self._prepare_proxy(conn)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 964, in _prepare_proxy
conn.connect()
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/connection.py", line 359, in connect
conn = self._connect_tls_proxy(hostname, conn)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/connection.py", line 506, in _connect_tls_proxy
ssl_context=ssl_context,
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 453, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 495, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/ssl.py", line 407, in wrap_socket
_context=self, _session=session)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/ssl.py", line 773, in __init__
raise ValueError("check_hostname requires server_hostname")
ValueError: check_hostname requires server_hostname
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro/dataloader/sq_data_load.py", line 165, in load_from_source
source.connect(self.config.incremental_column, max_inc_value)
File "/var/lib/squirro/topic/assets/dataloader_plugin/_global/squirro_plugin/squirro_plugin.py", line 113, in connect
self.client.authenticate(refresh_token=self.args.source_token)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro_client/base.py", line 241, in authenticate
self._perform_authentication(dict(base, **data))
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro_client/base.py", line 281, in _perform_authentication
raise ConnectionError(None, ex)
squirro_client.exceptions.ConnectionError: (None, ValueError('check_hostname requires server_hostname',))
MainThread squirro_plugin 2022-03-08 07:15:14,772 INFO The max inc value is stored in MySQL as 2022-03-01T14:48:05
MainThread squirro.service.datasource.background 2022-03-08 07:15:14,772 ERROR Exception invoking dataloader
Traceback (most recent call last):
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro_client/base.py", line 276, in _perform_authentication
r = session.post(url, data=data, timeout=self.timeout_secs)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/requests/sessions.py", line 590, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 696, in urlopen
self._prepare_proxy(conn)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 964, in _prepare_proxy
conn.connect()
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/connection.py", line 359, in connect
conn = self._connect_tls_proxy(hostname, conn)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/connection.py", line 506, in _connect_tls_proxy
ssl_context=ssl_context,
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 453, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 495, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/ssl.py", line 407, in wrap_socket
_context=self, _session=session)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/ssl.py", line 773, in __init__
raise ValueError("check_hostname requires server_hostname")
ValueError: check_hostname requires server_hostname
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro/service/datasource/background.py", line 240, in _process_dataloader_task
max_inc_value=source["max_inc_value"],
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro/service/datasource/dataload.py", line 95, in fetch_process_and_upload_items
max_inc_value=max_inc_value, cli_mode=False
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro/dataloader/sq_data_load.py", line 511, in execute_load_only
upload_rows=True,
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro/dataloader/sq_data_load.py", line 165, in load_from_source
source.connect(self.config.incremental_column, max_inc_value)
File "/var/lib/squirro/topic/assets/dataloader_plugin/_global/squirro_plugin/squirro_plugin.py", line 113, in connect
self.client.authenticate(refresh_token=self.args.source_token)
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro_client/base.py", line 241, in authenticate
self._perform_authentication(dict(base, **data))
File "/opt/squirro/virtualenv3/lib/python3.6/site-packages/squirro_client/base.py", line 281, in _perform_authentication
raise ConnectionError(None, ex)
squirro_client.exceptions.ConnectionError: (None, ValueError('check_hostname requires server_hostname',))
Any tips on how best to proceed? I can confirm that the server is indeed able to access the proxy via a simple curl request (via exporting the HTTPS_PROXY variable) and the hostname being used is in a normal format.
Following up on the above, I was able to resolve the issue by using the https_proxy key instead of the proxy key. It's also important to mention that when specifying the https_proxy key, the http_proxy key must be specified as well.
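For anyone hitting the same error, here is a rough Python sketch of what that workaround could look like in the [proxy] section of common.ini, together with a small check that both per-scheme keys are present. The proxy URL is reused from the example earlier in the thread and is only a placeholder (my actual values are not shown here), the `http://` scheme on the https_proxy value is an assumption, and `missing_proxy_keys` is a hypothetical helper, not part of Squirro.

```python
import configparser

# Placeholder values: substitute your own proxy host and port.
WORKAROUND = """
[proxy]
http_proxy = http://proxy.mycorp.com:8080
https_proxy = http://proxy.mycorp.com:8080
no_proxy = 127.0.0.1,localhost,127.0.0.1:81,localhost:81
"""


def missing_proxy_keys(ini_text):
    """List the per-scheme keys that are absent when `proxy` is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["proxy"] if parser.has_section("proxy") else {}
    if section.get("proxy"):
        return []  # the combined key covers both schemes
    return [key for key in ("http_proxy", "https_proxy") if not section.get(key)]


print(missing_proxy_keys(WORKAROUND))
```

An empty list means both keys are set. As noted earlier in the thread, the services still need to be restarted after changing the file.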
Thank you for the update @peter.brejza